Sao10K

Models by this creator


L3-8B-Stheno-v3.2

Sao10K

Total Score

145

The L3-8B-Stheno-v3.2 is an experimental AI model created by Sao10K, designed for immersive roleplaying and creative writing tasks. It builds upon previous versions of the Stheno model, with updates to the training data, hyperparameters, and overall performance. Compared to the similar L3-8B-Stheno-v3.1 model, v3.2 incorporates a mix of SFW and NSFW writing samples, more instruction/assistant-style data, and improved coherency and prompt adherence. The L3-8B-Stheno-v3.1-GGUF-IQ-Imatrix variant also offers quantized versions for lower VRAM requirements. Another related model from Sao10K, Fimbulvetr-11B-v2, is a SOLAR-based text model geared toward roleplay and creative writing (see its entry below).

Model inputs and outputs

The L3-8B-Stheno-v3.2 model is a text-to-text generation model designed for interactive roleplaying and creative writing tasks. It takes in prompts, system instructions, and user inputs, and generates relevant responses and story continuations.

Inputs

- Prompts: short text descriptions or instructions that set the context for the model's response
- System instructions: guidelines for the model's persona and expected behavior, such as roleplaying a specific character
- User inputs: conversational messages or story continuations provided by the human user

Outputs

- Narrative responses: creative, coherent text continuations that advance the story or conversation
- Character dialogue: believable, in-character responses that maintain the model's persona
- Descriptive details: vivid, immersive descriptions of scenes, characters, and actions

Capabilities

The L3-8B-Stheno-v3.2 model excels at open-ended roleplaying and storytelling tasks. It is capable of handling a wide range of scenarios, from fantastical adventures to intimate character interactions. The model maintains a strong sense of character and can fluidly continue a narrative, adapting to the user's prompts and inputs. Compared to earlier versions, v3.2 demonstrates improved handling of NSFW content, better assistant-style task performance, and enhanced multi-turn coherency. The model is also more adept at following prompts and instructions while still retaining its creative flair.

What can I use it for?

The L3-8B-Stheno-v3.2 model is well-suited for a variety of interactive, text-based experiences. Some potential use cases include:

- Roleplaying games: the model can serve as an interactive roleplaying partner, responding to user prompts and advancing the story in real time.
- Creative writing collaborations: users can work with the model to co-create engaging narratives, with the model generating compelling continuations and descriptive details.
- Conversational AI assistants: the model's ability to maintain character and engage in natural dialogue makes it a potential candidate for more advanced AI assistants.

Things to try

One notable aspect of the L3-8B-Stheno-v3.2 model is its ability to handle a mix of SFW and NSFW content. Users can experiment with prompts that explore the model's range, testing its capabilities in tasteful, family-friendly scenarios as well as more mature, adult-oriented situations. Another avenue to explore is the model's performance on assistant-style tasks, such as answering questions, providing explanations, or offering advice. Users can try crafting prompts that challenge the model to demonstrate its knowledge and problem-solving skills in a more practical, non-fiction-oriented context.
Overall, the L3-8B-Stheno-v3.2 model offers a versatile and engaging platform for immersive text-based experiences. Its combination of creative storytelling and adaptable conversational abilities makes it a promising tool for a variety of applications.
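As a rough illustration of the input structure described above (a system instruction that sets the persona, followed by user turns), here is a minimal sketch using Hugging Face transformers. The repository id and sampler values are assumptions chosen for illustration, not settings taken from the model card.

```python
# Minimal sketch, assuming the model is published as "Sao10K/L3-8B-Stheno-v3.2" on
# Hugging Face and uses Llama-3-style chat templating. Sampler values are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Sao10K/L3-8B-Stheno-v3.2"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# A system instruction sets the persona; user inputs carry the story forward.
messages = [
    {"role": "system", "content": "You are Kaelen, a wry tavern keeper in a rain-soaked port town. Stay in character."},
    {"role": "user", "content": "I push open the tavern door, shaking rain from my cloak. 'Got a room for the night?'"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=1.1, top_p=0.95)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```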


Updated 7/2/2024


Fimbulvetr-11B-v2

Sao10K

Total Score

110

The Fimbulvetr-11B-v2 model is a large language model created by the AI researcher Sao10K. It is built on the SOLAR base model ("Solar-based" refers to that architecture, not solar energy) and trained on a mix of publicly available online data. The model accepts Alpaca or Vicuna prompt formats and is recommended to be used with the SillyTavern Universal Light presets. Similar models include the Llama-2-7B-GGUF model created by TheBloke, a 7 billion parameter model from Meta's Llama 2 collection that has been converted to the GGUF format, and the Phind-CodeLlama-34B-v2-GGUF model, a 34 billion parameter model created by Phind that has been optimized for programming tasks.

Model inputs and outputs

The Fimbulvetr-11B-v2 model accepts text-based prompts in either the Alpaca or Vicuna format. The Alpaca format provides an instruction, optional input context, and a request for the model to generate a response. The Vicuna format provides a system message that sets the tone and guidelines for the interaction, followed by a user prompt for the model to respond to.

Inputs

- Prompt: text-based prompts in either the Alpaca or Vicuna format, providing instructions and context for the model to generate a response

Outputs

- Generated text: coherent text produced in response to the provided prompt, adhering to the guidelines and tone set in the system message

Capabilities

The Fimbulvetr-11B-v2 model is capable of generating high-quality text in response to a wide variety of prompts, from open-ended conversations to more specific tasks like answering questions or providing explanations. The model has been trained to be helpful, respectful, and honest in its responses, and to avoid harmful, unethical, or biased content.

What can I use it for?

The Fimbulvetr-11B-v2 model can be used for a variety of natural language processing tasks, such as:

- Chatbots and conversational AI: the model can power chatbots and other conversational AI systems, providing users with helpful and engaging responses.
- Content generation: the model can generate coherent and well-written text on a wide range of topics, such as articles, stories, or scripts.
- Question answering: the model can answer questions on a variety of subjects, drawing upon its broad knowledge base.

To run the model locally with lower memory requirements, you can download quantized weights from the Fimbulvetr-11B-v2-GGUF repository (covered in its own entry below) and integrate them into your own applications or projects.

Things to try

One notable aspect of the Fimbulvetr-11B-v2 model is its SOLAR base, which sets it apart from Sao10K's Llama 3-based Stheno models and may give it a somewhat different writing style and strengths. Another intriguing area to investigate is the model's ability to engage in open-ended, creative conversations. The supported Alpaca and Vicuna prompt formats suggest the model is well-suited for imaginative roleplay or collaborative storytelling applications, where users can explore different narrative paths and scenarios with the model.
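The Alpaca layout mentioned above can be assembled with a few lines of string handling. The header wording below is the common community Alpaca template and should be treated as an assumption; check the model card's own example before relying on it.

```python
# A minimal sketch of an Alpaca-style prompt builder, assuming the standard
# instruction / input / response headers. Not taken verbatim from the model card.
def alpaca_prompt(instruction: str, context: str = "") -> str:
    """Assemble an Alpaca-style prompt with an optional input context block."""
    parts = [
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.",
        f"### Instruction:\n{instruction}",
    ]
    if context:
        parts.append(f"### Input:\n{context}")
    parts.append("### Response:\n")
    return "\n\n".join(parts)

print(alpaca_prompt(
    instruction="Continue the scene in the narrator's voice.",
    context="The caravan halted at the edge of the frozen fjord as the sun dipped below the ridge.",
))
```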


Updated 5/28/2024


L3-8B-Stheno-v3.1

Sao10K

Total Score

100

The Llama-3-8B-Stheno-v3.1 model is an experimental roleplay-focused model created by Sao10K. It was fine-tuned using outputs from the Claude-3-Opus model along with human-generated data, with the goal of being well-suited for one-on-one roleplay scenarios, RPGs, and creative writing. Compared to the original Llama 3 base, this version has been optimized for roleplay use cases. The model is known as L3-RP-v2.1 on the Chaiverse platform, where it performed well with an Elo rating over 1200. Sao10K notes that the model handles character personalities effectively for one-on-one roleplay sessions, but may require additional context and examples when used for broader narrative or RPG scenarios. The model leans toward NSFW content, so users should state explicitly in their prompts if they want to avoid that.

Model inputs and outputs

Inputs

- Textual prompts for chatting, roleplaying, or creative writing

Outputs

- Textual responses generated by the model to continue the conversation or narrative

Capabilities

The Llama-3-8B-Stheno-v3.1 model excels at immersive one-on-one roleplaying, with the ability to maintain consistent character personalities and flowing prose. It can handle a variety of roleplay scenarios, from fantasy RPGs to more intimate interpersonal interactions. The model also demonstrates creativity in its narrative outputs, making it well-suited for collaborative storytelling and worldbuilding.

What can I use it for?

This model is well-suited for applications focused on interactive roleplay and creative writing. Game developers could leverage it to power NPCs and interactive storytelling in RPGs or narrative-driven games. Writers could use it to aid in collaborative worldbuilding and character development for their stories. The model's uncensored nature also makes it potentially useful for adult-oriented roleplaying and creative content, though users should be mindful of potential risks and legal considerations.

Things to try

Try using the model to engage in open-ended roleplaying scenarios, either one-on-one or in a group setting. Experiment with providing it with detailed character backstories and see how consistently it maintains each character's personality and voice. You could also challenge the model with more complex narrative prompts, such as worldbuilding exercises or branching storylines, to explore its creative writing capabilities.


Updated 6/26/2024

L3-70B-Euryale-v2.1

Sao10K

Total Score

86

The L3-70B-Euryale-v2.1 is a large language model created by Sao10K, a prominent AI model developer and maintainer. This 70 billion parameter model is designed as a more capable sibling to Sao10K's previous L3-8B-Stheno-v3.1 and L3-8B-Stheno-v3.2 models, with enhanced capabilities in areas like prompt adherence, anatomy/spatial awareness, and adapting to unique formatting. As described on Sao10K's maintainer profile, the model was trained on 8 NVIDIA H100 SXM GPUs and aims to be a "big brained version of Stheno."

Model inputs and outputs

The L3-70B-Euryale-v2.1 model can handle a variety of text-based inputs, from simple prompts to more complex multi-turn exchanges and roleplay scenarios. It is particularly well-suited for tasks like creative writing, storytelling, and one-on-one roleplay interactions.

Inputs

- Prompts: the model can accept prompts of varying complexity, from simple instructions to detailed scenario descriptions.
- Conversations: the model can engage in multi-turn conversations, maintaining coherence and context across exchanges.
- Roleplay scenarios: the model can inhabit specific character roles and continue roleplay interactions.

Outputs

- Creative writing: the model can generate original stories, descriptions, and narratives based on provided prompts.
- Dialogue and roleplay: the model can produce natural-sounding dialogue and roleplay responses tailored to the given context and characters.
- Formatting and structure: the model can adapt its outputs to unique formatting requirements, such as specific templates or reply structures.

Capabilities

The L3-70B-Euryale-v2.1 model excels at tasks that require a combination of creativity, contextual awareness, and adaptability. It has been described as having better prompt adherence, improved anatomy and spatial awareness, and the ability to generate unique and varied responses. Compared to L3-8B-Stheno-v3.2, this model is more "big brained" and better at handling subtler nuances and contexts.

What can I use it for?

The L3-70B-Euryale-v2.1 model is well-suited for a variety of creative and interactive applications, such as:

- Roleplaying and story generation: the model can facilitate immersive roleplaying experiences, where users engage in detailed character interactions and collaborative storytelling.
- Creative writing and worldbuilding: the model's strong narrative capabilities make it a useful tool for writers, authors, and worldbuilders who need to generate rich, detailed content.
- Virtual assistants and chatbots: the model's adaptability and conversational skills could be leveraged to create advanced virtual assistants or chatbots for customer service, education, or entertainment purposes.

Things to try

One key aspect of the L3-70B-Euryale-v2.1 model is its ability to handle unique formatting and reply structures. Experimenting with different templates, such as the provided Euryale-v2.1-Llama-3-Instruct preset for SillyTavern, can help unlock the model's full potential in interactive scenarios. Additionally, exploring the recommended sampler settings, including temperature, min_p, and repetition penalty, can help fine-tune the model's creative output and response quality.
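For the sampler settings mentioned in the last paragraph, a transformers GenerationConfig is one convenient place to hold them. The numeric values below are illustrative guesses, not the values recommended on the model card, and min_p sampling requires a reasonably recent transformers release.

```python
# A small sketch of the sampler knobs called out above (temperature, min_p,
# repetition penalty). Values are placeholders, not official recommendations.
from transformers import GenerationConfig

euryale_sampling = GenerationConfig(
    do_sample=True,
    temperature=1.15,         # placeholder; tune per the model card
    min_p=0.075,              # placeholder; needs a recent transformers version
    repetition_penalty=1.10,  # placeholder
    max_new_tokens=350,
)
# Pass it at generation time, e.g. model.generate(input_ids, generation_config=euryale_sampling)
```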


Updated 7/18/2024


Fimbulvetr-11B-v2-GGUF

Sao10K

Total Score

78

Fimbulvetr-11B-v2-GGUF is a set of GGUF quantizations of a large language model created by Sao10K, who maintains a profile at https://aimodels.fyi/creators/huggingFace/Sao10K. It packages the version 2 update of the Fimbulvetr-11B model and includes additional GGUF quant files from contributor mradermacher. The underlying model is described as a "Solar-Based Model" (referring to the SOLAR base model) and is fine-tuned on Alpaca or Vicuna prompt formats.

Model inputs and outputs

Fimbulvetr-11B-v2-GGUF is a text-to-text model: it takes text prompts and generates text in response. It handles both Alpaca and Vicuna prompt formats, with the SillyTavern Universal Light presets recommended.

Inputs

- Text prompts in the Alpaca or Vicuna format

Outputs

- Generated text continuations and responses to the input prompts

Capabilities

The Fimbulvetr-11B-v2-GGUF quantizations provide the same roleplay- and writing-oriented text generation as the base Fimbulvetr-11B-v2 model, while the GGUF format lets the model run locally on llama.cpp-compatible runtimes with reduced memory requirements.

What can I use it for?

The Fimbulvetr-11B-v2-GGUF files are useful when you want to run Fimbulvetr-11B-v2 on consumer hardware, for example to power local chatbots, roleplay assistants, or creative writing tools without a dedicated server. The model's flexibility in handling different prompt formats also makes it suitable for integration into chatbots or virtual assistants.

Things to try

One interesting thing to try with Fimbulvetr-11B-v2-GGUF is comparing the different quantization levels on the same prompts to find the best trade-off between output quality and memory use for your hardware. You could also experiment with both the Alpaca and Vicuna prompt formats to see which produces better results for your use case.
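A minimal way to try one of the quant files locally is through llama-cpp-python, which can pull GGUF weights straight from the Hub. The repo id and quant filename pattern below are assumptions; pick whichever quant level fits your hardware.

```python
# Minimal sketch, assuming the quants live at "Sao10K/Fimbulvetr-11B-v2-GGUF" and that
# a Q4_K_M file is present. Requires llama-cpp-python and huggingface_hub.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Sao10K/Fimbulvetr-11B-v2-GGUF",  # assumed repo id
    filename="*Q4_K_M.gguf",                  # assumed quant; glob picks a matching file
    n_ctx=4096,
    n_gpu_layers=-1,  # offload all layers to the GPU if one is available
)

# Alpaca-formatted prompt, per the formats the model supports.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nDescribe the harbor at dawn in three sentences.\n\n"
    "### Response:\n"
)
out = llm(prompt, max_tokens=200, temperature=0.9)
print(out["choices"][0]["text"])
```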


Updated 5/28/2024


L3-8B-Lunaris-v1

Sao10K

Total Score

69

The L3-8B-Lunaris-v1 is a generalist / roleplaying model merge based on Llama 3, created by maintainer Sao10K. It was developed by merging several existing Llama 3 models, including Meta-Llama/Meta-Llama-3-8B-Instruct, crestf411/L3-8B-sunfall-v0.1, Hastagaras/Jamet-8B-L3-MK1, maldv/badger-iota-llama-3-8b, and Sao10K/Stheno-3.2-Beta. The model is intended for roleplay scenarios, but can also handle broader tasks like storytelling and general knowledge. It is an experimental model that aims to strike a better balance between creativity and logic than previous iterations.

Model inputs and outputs

Inputs

- Text prompts

Outputs

- Generative text outputs, including dialogue, stories, and informative responses

Capabilities

The L3-8B-Lunaris-v1 model is capable of engaging in open-ended dialogue and roleplaying scenarios. It can build upon provided context to generate coherent and creative responses. The model also demonstrates strong general knowledge, allowing it to assist with a variety of informative tasks.

What can I use it for?

This model can be a useful tool for interactive storytelling, character-driven roleplay, and open-ended conversational scenarios. Developers may find it valuable for building applications that involve natural language interaction, such as chatbots, virtual assistants, or interactive fiction. The model's balanced approach to creativity and logic could make it suitable for use cases that require a mix of imagination and reasoning.

Things to try

One interesting aspect of the L3-8B-Lunaris-v1 model is its ability to generate varied and unique responses when prompted multiple times. Developers may want to experiment with regenerating outputs to see how the model explores different directions and perspectives. It can also be worthwhile to provide the model with detailed character information or narrative prompts to see how it builds upon the context to drive the story forward.


Updated 8/7/2024


Ramble

Sao10K

Total Score

58

Ramble is a multi-purpose AI model created by the maintainer Sao10K. It is similar to other models maintained by Sao10K such as Fimbulvetr-11B-v2, L3-8B-Stheno-v3.2, and L3-8B-Stheno-v3.1. These models span a range of capabilities including text generation, story writing, roleplaying, and task-oriented assistance.

Model inputs and outputs

Ramble is a text-to-text model, meaning it takes text as input and generates new text as output. The model was trained on a large corpus of online text data, including blogs, forums, and other user-generated content.

Inputs

- Free-form text: the model can accept any text-based input, from short prompts to longer passages.

Outputs

- Generated text: based on the input, the model will produce new text in a coherent and contextual manner. This can include stories, conversations, or task-oriented responses.

Capabilities

Ramble demonstrates strong capabilities in areas like informal writing, personal anecdotes, and stream-of-consciousness style narratives. It can engage in freeform conversations, provide task-oriented assistance, and even generate creative fictional content. The model's outputs often have a natural, conversational tone.

What can I use it for?

Ramble could be useful for a variety of applications, such as:

- Creative writing: generating story ideas, dialogue, and descriptive passages for fiction, screenplays, or other creative projects.
- Personal blogging: producing relatable, conversational content for a personal blog or online journal.
- Customer service chatbots: providing friendly, contextual responses to user inquiries in a customer service or support setting.
- Roleplay and interactive fiction: engaging in freeform roleplaying scenarios or text-based adventures.

Things to try

One interesting aspect of Ramble is its ability to generate stream-of-consciousness style text, similar to a person's inner monologue. Providing the model with open-ended prompts about personal experiences, thoughts, or observations can result in engaging, human-like narratives. Another intriguing use case is leveraging Ramble's natural conversational abilities for interactive storytelling or roleplaying. Giving the model a character prompt and engaging it in dialogue can lead to immersive, collaborative experiences.


Updated 9/6/2024


MN-12B-Lyra-v1

Sao10K

Total Score

57

The MN-12B-Lyra-v1 is an experimental general roleplaying model developed by Sao10K. It is a merge of two Mistral-Nemo 12B variants, one focused on instruction-following and the other on roleplay and creative writing. The model scored well on EQ-Bench, ranking just below the Nemomix v4 model. Sao10K found that a temperature of 1.2 and a min_p of 0.1 work well for this model, though they also note that it can perform well at lower temperatures.

The two merged variants were trained on differently formatted datasets - one on the Mistral Instruct format and one on ChatML. Sao10K found that keeping the datasets separate and combining the resulting models with the della_linear merge method worked best, as opposed to mixing the datasets together in a single run. They also note that the base Nemo 12B model was difficult to train on their datasets, and that they would likely need to do some stage-wise fine-tuning in the future.

Model inputs and outputs

Inputs

- Either [INST] or ChatML input formats work well for this model.

Outputs

- Text outputs in a general roleplaying and creative writing style.

Capabilities

The MN-12B-Lyra-v1 model excels at general roleplaying tasks, with good performance on EQ-Bench. Sao10K notes that the model can handle a context length of up to 16K tokens, which is sufficient for most roleplaying use cases.

What can I use it for?

The MN-12B-Lyra-v1 model is well-suited for creative writing, storytelling, and roleplaying applications. Its ability to generate coherent and engaging text could make it useful for applications like interactive fiction, collaborative worldbuilding, or even as a foundation for more advanced AI-driven narratives.

Things to try

One interesting aspect of the MN-12B-Lyra-v1 model is Sao10K's observation that the base Nemo 12B model was difficult to train on their datasets, and that some stage-wise fine-tuning would likely be needed in the future. This suggests the model may benefit from a more iterative, multi-stage training process to optimize its performance on specific types of tasks or datasets. Sao10K also notes that the model's effective context length of 16K tokens may be a limitation for some applications, and that they are working on further iterations to improve upon this. Trying the model with longer context lengths or more advanced prompt engineering techniques could be an interesting area of exploration.
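For reference, the ChatML format mentioned above wraps each turn in <|im_start|>/<|im_end|> tags. Here is a minimal sketch of a prompt builder using that layout, paired with the sampler values quoted in the description (temperature 1.2, min_p 0.1); the persona and scenario text are made up for illustration.

```python
# Minimal ChatML prompt sketch. The tag layout is the standard ChatML convention;
# the character and scenario text are illustrative, not from the model card.
def chatml_prompt(system: str, user: str) -> str:
    """Wrap a system message and one user turn in ChatML tags, opening the assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = chatml_prompt(
    system="You are Lyra, a dry-witted ship's navigator. Stay in character.",
    user="We drifted off course again, didn't we?",
)
sampler = {"temperature": 1.2, "min_p": 0.1}  # values quoted in the description above
```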


Updated 9/18/2024


Llama-3.1-8B-Stheno-v3.4

Sao10K

Total Score

52

The Llama-3.1-8B-Stheno-v3.4 model is a text generation AI model created by the maintainer Sao10K. The model went through a multi-stage finetuning process, first on a multi-turn Conversational-Instruct dataset, and then on Creative Writing and Roleplay datasets. It is built on top of the Llama 3.1 base model and has a distinctive style compared to previous Stheno versions.

Similar models created by Sao10K include the L3-8B-Stheno-v3.1, L3-8B-Stheno-v3.3-32K, and L3-8B-Stheno-v3.2. These models share similar training approaches and capabilities, with variations in the datasets used and the overall model size.

Model inputs and outputs

Inputs

- The model accepts text inputs; the "L3 Instruct Formatting - Euryale 2.1 Preset" is recommended for best results.
- Recommended sampler settings are a temperature of around 1.4 and a min_p of around 0.2.

Outputs

- The model generates text responses based on the input prompt, with a distinctive style and personality compared to previous Stheno versions.
- The outputs can vary in length and tone, with the model demonstrating good multi-turn coherency and the ability to handle a range of scenarios, from roleplaying to creative writing.

Capabilities

The Llama-3.1-8B-Stheno-v3.4 model excels at text generation tasks that require a blend of instruction following, creativity, and personality. It can handle multi-turn conversations, engage in roleplay scenarios, and produce coherent and varied creative writing. The model has been trained to adhere closely to system prompts and to demonstrate good reasoning and spatial awareness.

What can I use it for?

The Llama-3.1-8B-Stheno-v3.4 model can be a valuable tool for a variety of text-based applications, such as interactive storytelling, creative writing assistants, and roleplaying chatbots. Its strong adherence to system prompts and ability to handle multi-turn interactions make it well-suited for use in virtual assistant or conversational AI applications. Additionally, the model's emphasis on creativity and personality could make it useful in entertainment or artistic applications, such as generating unique and engaging narrative content.

Things to try

One interesting aspect of the Llama-3.1-8B-Stheno-v3.4 model is its ability to generate varied and unique responses when prompted with the same input. Users can experiment with regenerating responses to see how the model's outputs evolve and change based on factors like temperature or repetition penalty. Additionally, exploring the model's capabilities in specific scenarios, such as roleplaying or creative writing tasks, can help uncover its strengths and potential use cases.
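The "L3 Instruct" formatting referred to above is the standard Llama 3 chat template, so a tokenizer's chat template can reproduce it directly. The repo id below is an assumption; the temperature and min_p values are the ones quoted in the description.

```python
# Minimal sketch, assuming the model is published as "Sao10K/Llama-3.1-8B-Stheno-v3.4"
# and ships with the standard Llama 3 chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Sao10K/Llama-3.1-8B-Stheno-v3.4")  # assumed repo id

messages = [
    {"role": "system", "content": "Roleplay as Mira, a soft-spoken archivist. Write in third person."},
    {"role": "user", "content": "Mira, did you find the missing ledger?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # shows the <|start_header_id|> ... <|end_header_id|> turn structure

sampler_settings = {"temperature": 1.4, "min_p": 0.2}  # values quoted in the description above
```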


Updated 9/18/2024


L3-8B-Stheno-v3.3-32K

Sao10K

Total Score

46

The L3-8B-Stheno-v3.3-32K is a language model developed by Sao10K and trained with compute from Backyard.ai. It is an iterative improvement over previous versions of the Stheno model, with a focus on enhancing roleplaying capabilities, creative writing, and overall coherency. Compared to the earlier Stheno-v3.1 and Stheno-v3.2 models, this version integrates more training data and fine-tuning to address issues with long-context understanding and reasoning.

Model inputs and outputs

The L3-8B-Stheno-v3.3-32K is a text-to-text model, meaning it takes in textual prompts and generates textual responses.

Inputs

- Textual prompts, including instructions, conversations, or creative writing scenarios

Outputs

- Coherent and contextually relevant textual responses, ranging from roleplay dialogue to narrative storytelling

Capabilities

The L3-8B-Stheno-v3.3-32K excels at roleplaying and creative writing tasks, showcasing strong language generation capabilities and the ability to maintain consistent characterization over extended exchanges. While it has some limitations in long-form reasoning, the model performs well on many common language tasks and can be a valuable tool for interactive storytelling, collaborative worldbuilding, and other applications requiring flexible and imaginative text generation.

What can I use it for?

The L3-8B-Stheno-v3.3-32K model can be a useful asset for a variety of creative and interactive applications, such as:

- Roleplaying and interactive storytelling: the model's strong grasp of character, tone, and narrative can make it a compelling partner for one-on-one roleplaying scenarios or collaborative worldbuilding exercises.
- Creative writing and ideation: the model's generative capabilities can help spark new ideas, flesh out plot lines, or explore creative writing prompts.
- Conversational AI assistants: with its ability to understand context and generate coherent responses, the model could be integrated into chatbots or virtual assistants for more natural and engaging interactions.

Things to try

One interesting aspect of the L3-8B-Stheno-v3.3-32K model is its sensitivity to prompt formatting and the inclusion of contextual information. By providing the model with detailed character profiles, setting details, or task-specific instructions, users can guide the model's responses and leverage its strengths in areas like roleplaying and creative writing. Experimenting with different prompting strategies and fine-tuning the model's sampling parameters can help unlock its full potential for various applications.
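To illustrate the kind of contextual priming suggested in the last paragraph, here is a small sketch of a detailed character card folded into a system prompt. The card fields and wording are made up for illustration, not a format the model card prescribes; the 32K context window leaves ample room for cards like this plus a long chat history.

```python
# Illustrative character card and system prompt; field names and content are assumptions.
character_card = """\
Name: Ser Aldric Vane
Role: Disgraced knight turned caravan guard
Personality: Stoic, dryly funny, protective of the weak
Speech style: Short sentences, archaic phrasing
Setting: The trade road between Karsmouth and the Ashen Pass, late autumn
Scenario: The caravan has stopped for the night; bandits were spotted at dusk."""

system_prompt = (
    "You are roleplaying as the character described below. "
    "Stay in character and never speak for the user.\n\n" + character_card
)
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "I sit down across the fire from Aldric and offer him my flask."},
]
# Feed `messages` through the model's Llama 3 chat template, as in the other entries above.
```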


Updated 9/6/2024