MPT-7B-Storywriter-GGML

Maintainer: TheBloke

Total Score: 53

Last updated 5/28/2024


Run this model: Run on HuggingFace
API spec: View on HuggingFace
Github link: No Github link provided
Paper link: No paper link provided


Model overview

The MPT-7B-Storywriter-GGML is a set of quantized versions of MosaicML's MPT-7B-StoryWriter model, which was fine-tuned for story writing and long-form text generation. MosaicML built the underlying model by fine-tuning MPT-7B on a filtered fiction subset of the books3 dataset with a 65k-token context length; TheBloke provides this GGML conversion in 4-bit, 5-bit, and 8-bit quantizations for efficient CPU and GPU inference. The model builds on the base MPT-7B architecture, which uses techniques like FlashAttention and ALiBi for fast training and inference. Because of its fine-tuning on long-form fiction, the MPT-7B-Storywriter-GGML model is optimized for generating coherent, engaging stories over extremely long context lengths.
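Because this is a GGML release, it is typically run with a GGML-capable runtime rather than the standard transformers stack. Below is a minimal sketch using the ctransformers Python library; the quantized filename, context length, and GPU settings are assumptions to adapt to the files actually published in the repository.

```python
# Minimal sketch: loading a quantized MPT GGML checkpoint with ctransformers.
# The model_file name is an assumption; check the repository's file list for
# the exact quantization you want.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/MPT-7B-Storywriter-GGML",
    model_file="mpt-7b-storywriter.ggmlv3.q4_0.bin",  # assumed 4-bit file
    model_type="mpt",
    context_length=16384,  # ALiBi lets MPT run past its nominal context
    gpu_layers=0,          # CPU-only; raise to offload layers to a GPU
)
```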

Model inputs and outputs

Inputs

  • Raw text prompts for story generation

Outputs

  • Continued story text based on the provided prompt, with the ability to generate passages tens of thousands of tokens long.
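A minimal sketch of that prompt-in, story-out contract, reusing the llm object from the loading example above (the prompt and sampling settings are illustrative assumptions):

```python
# Sketch of the input/output contract: a raw text prompt in, a long story
# continuation out. Reuses the `llm` object loaded earlier.
prompt = "The lighthouse keeper had not spoken to another soul in three years, until"

continuation = llm(
    prompt,
    max_new_tokens=1024,     # long-form continuations can run much longer
    temperature=0.8,
    repetition_penalty=1.1,
)
print(prompt + continuation)
```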

Capabilities

The MPT-7B-Storywriter-GGML model excels at generating long-form fictional stories and narratives. It can take short prompts and continue them for thousands of tokens, maintaining coherence, plot, and character development throughout. The model's use of techniques like ALiBi allows it to handle context lengths far beyond the typical 2048 tokens seen in other language models.

What can I use it for?

The MPT-7B-Storywriter-GGML model is well-suited for applications that require long-form text generation, such as interactive storytelling, fiction writing assistance, and creative writing tools. Its ability to maintain coherence over extended passages makes it useful for generating novel-length stories or narratives from simple prompts.

Companies may find this model useful for building interactive fiction experiences, AI-generated books, or other creative content generation tools. The GGML format also allows for efficient on-device inference, opening up possibilities for mobile or embedded applications.

Things to try

One interesting thing to try with the MPT-7B-Storywriter-GGML model is to provide it with a short prompt - just a sentence or two - and see how it expands that into a lengthy, cohesive story. The model's strong grasp of narrative structure allows it to take simple beginnings and weave them into compelling tales. Experiment with different genres, character types, or story hooks to see the breadth of its creative capabilities.

Another avenue to explore is the model's ability to handle extremely long context lengths. Try providing it with a multi-paragraph prompt or even the full text of a short story, then have it continue the narrative. Observe how it maintains consistency and develops the story over hundreds or thousands of additional tokens.
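As a rough sketch of that workflow, an existing multi-paragraph draft could be read from disk, fed to the model in one piece, and the continuation streamed back as it is generated; this assumes the ctransformers llm object from the earlier example and a hypothetical draft.txt file holding the story so far.

```python
# Sketch: continue an existing multi-paragraph draft, streaming the output.
# Assumes the `llm` object from the loading example; "draft.txt" is a
# hypothetical file holding the story so far.
with open("draft.txt", encoding="utf-8") as f:
    story_so_far = f.read()

for piece in llm(story_so_far, max_new_tokens=4096, temperature=0.8, stream=True):
    print(piece, end="", flush=True)
```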



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Models

🎯

mpt-7b-storywriter-4bit-128g

Maintainer: OccamRazor

Total Score: 122

mpt-7b-storywriter-4bit-128g is a model designed to read and write fictional stories with super long context lengths. It was built by finetuning MPT-7B with a context length of 65k tokens on a filtered fiction subset of the books3 dataset. At inference time, thanks to ALiBi, mpt-7b-storywriter-4bit-128g can extrapolate even beyond 65k tokens; MosaicML demonstrated generations as long as 84k tokens on a single node of 8 A100-80GB GPUs in their blog post.

Model inputs and outputs

mpt-7b-storywriter-4bit-128g is a text-to-text model that can be used to generate long-form fictional stories. It takes arbitrary text as input and outputs generated text.

Inputs

  • Arbitrary text prompt

Outputs

  • Continuation and elaboration of the input text in the style of a fictional story

Capabilities

mpt-7b-storywriter-4bit-128g excels at generating coherent, long-form fictional narratives. It can maintain context and plot coherence over extremely long text sequences, producing stories that can span tens of thousands of tokens. This makes it well-suited for applications that require the generation of lengthy, structured creative writing.

What can I use it for?

mpt-7b-storywriter-4bit-128g could be used to assist creative writers by generating story ideas, plot points, or even full narrative arcs that writers can then expand upon. It could also be used to create interactive fiction or text-based adventure games, where the model generates the narrative content dynamically based on user inputs. Additionally, the model's capabilities could be leveraged for educational purposes, such as helping students practice creative writing or analyze literary elements in fictional stories.

Things to try

One interesting aspect of mpt-7b-storywriter-4bit-128g is its ability to extrapolate beyond the 65k token context length it was trained on, thanks to the ALiBi technique. This means you can try feeding the model very long input texts and see how it continues the story, potentially generating coherent narratives that span tens of thousands of tokens. Experimenting with different prompts and genres could also yield interesting results and showcase the model's versatility in creative writing.


👨‍🏫

mpt-7b-storywriter

Maintainer: mosaicml

Total Score: 793

The mpt-7b-storywriter is a large language model developed by MosaicML that is designed to read and write fictional stories with very long context lengths. It was built by fine-tuning the base MPT-7B model on a filtered fiction subset of the books3 dataset. The model uses ALiBi to extrapolate beyond its 65k-token training context length, demonstrating generations up to 84k tokens.

The mpt-7b-storywriter model is part of the MosaicPretrainedTransformer (MPT) family, which uses a modified transformer architecture optimized for efficient training and inference. These architectural changes include performance-optimized layer implementations and the elimination of context length limits. The MPT models can be served efficiently with both standard Hugging Face pipelines and NVIDIA's FasterTransformer.

Model inputs and outputs

Inputs

  • Text prompts of up to 65,536 tokens in length, thanks to the use of ALiBi

Outputs

  • Continued story text generation, with the ability to extrapolate beyond the 65k token training context length up to 84k tokens

Capabilities

The mpt-7b-storywriter model is designed to excel at generating long-form fictional stories. It can handle extremely long input contexts and produce coherent, extended narratives. This makes it well-suited for tasks like creative writing assistance, story generation, and even interactive storytelling applications.

What can I use it for?

The mpt-7b-storywriter model can be used for a variety of creative writing and storytelling applications. Some potential use cases include:

  • Generating original story ideas and plot outlines
  • Assisting human writers by producing narrative continuations and story extensions
  • Creating interactive fiction or choose-your-own-adventure style narratives
  • Developing conversational storytelling agents or interactive characters

Things to try

One interesting aspect of the mpt-7b-storywriter model is its ability to handle extremely long input context lengths and produce cohesive, extended narratives. You could try providing the model with a short story prompt and see how it continues and develops the story over many thousands of tokens. Alternatively, you could experiment with giving the model partial story outlines or character descriptions and see how it fleshes out the narrative.

Another intriguing possibility is to fine-tune or adapt the mpt-7b-storywriter model for specific genres, styles, or storytelling formats. This could involve further training on domain-specific datasets or incorporating custom prompting techniques to tailor the model's outputs.
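As a concrete illustration of the Hugging Face serving path mentioned above, the sketch below follows the loading pattern documented on the MosaicML model card; the max_seq_len value and dtype are assumptions to adjust for the hardware available.

```python
# Sketch of the standard Hugging Face loading path for the original
# full-precision model. max_seq_len and dtype are assumptions.
import torch
import transformers

name = "mosaicml/mpt-7b-storywriter"

config = transformers.AutoConfig.from_pretrained(name, trust_remote_code=True)
config.max_seq_len = 83968  # extend beyond the 65k training length via ALiBi

model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    config=config,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # MPT ships custom modeling code
)
# MPT models reuse the EleutherAI/gpt-neox-20b tokenizer
tokenizer = transformers.AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
```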


📈

GPT4All-13B-snoozy-GGML

Maintainer: TheBloke

Total Score: 47

The GPT4All-13B-snoozy-GGML model is a 13-billion parameter language model developed by Nomic.AI and maintained by TheBloke. Like similar large language models such as GPT4-x-Vicuna-13B and Nous-Hermes-13B, it is based on Meta's LLaMA architecture and has been fine-tuned on a variety of datasets to improve its performance on instructional and conversational tasks.

Model inputs and outputs

The GPT4All-13B-snoozy-GGML model follows a typical language model input/output format. It takes in a sequence of text as input and generates a continuation of that text as output. The model can be used for a wide range of natural language processing tasks, from open-ended conversation to task-oriented instruction following.

Inputs

  • Text prompts of varying length, from single sentences to multi-paragraph passages

Outputs

  • Continued text in the same style and tone as the input, ranging from short responses to multi-paragraph generations

Capabilities

The GPT4All-13B-snoozy-GGML model is capable of engaging in open-ended conversation, answering questions, and following instructions across a variety of domains. It has been fine-tuned on datasets like ShareGPT, WizardLM, and Alpaca-CoT, giving it strong performance on tasks like roleplay, creative writing, and step-by-step problem solving.

What can I use it for?

The GPT4All-13B-snoozy-GGML model can be used for a wide range of natural language processing applications, from chatbots and virtual assistants to content generation and task automation. Its strong performance on instructional tasks makes it well-suited for use cases like step-by-step guides, task planning, and procedural knowledge transfer. Researchers and developers can also use the model as a starting point for further fine-tuning or customization.

Things to try

One interesting aspect of the GPT4All-13B-snoozy-GGML model is its ability to engage in open-ended and imaginative conversations. Try prompting it with creative writing prompts or hypothetical scenarios and see how it responds. You can also experiment with providing the model with detailed instructions or prompts and observe how it breaks down and completes the requested tasks.


🔎

WizardLM-Uncensored-SuperCOT-StoryTelling-30B-GGML

Maintainer: TheBloke

Total Score: 46

The WizardLM-Uncensored-SuperCOT-StoryTelling-30B-GGML is a powerful language model created by TheBloke. This model is a triple model merge of WizardLM Uncensored, SuperCOT (chain-of-thought reasoning), and Storytelling, resulting in a comprehensive boost in reasoning and story writing capabilities.

Model inputs and outputs

The WizardLM-Uncensored-SuperCOT-StoryTelling-30B-GGML model takes natural language prompts as input and generates coherent, contextual responses. The model can handle a wide variety of tasks, from open-ended conversations to specific queries and creative story generation.

Inputs

  • Natural language prompts, questions, or instructions

Outputs

  • Coherent, contextual responses in natural language
  • Detailed, multi-paragraph answers to complex queries
  • Creative stories and narratives

Capabilities

The WizardLM-Uncensored-SuperCOT-StoryTelling-30B-GGML model demonstrates impressive reasoning, storytelling, and general knowledge capabilities. It can engage in thoughtful discussions, provide detailed explanations, and generate captivating narratives across a broad range of topics.

What can I use it for?

The versatility of the WizardLM-Uncensored-SuperCOT-StoryTelling-30B-GGML model makes it suitable for a wide variety of applications. It could be used for interactive storytelling experiences, virtual assistants, educational tools, creative writing aids, and more. Businesses could leverage the model's capabilities for content generation, customer service, and even product ideation.

Things to try

One interesting aspect of the WizardLM-Uncensored-SuperCOT-StoryTelling-30B-GGML model is its ability to provide detailed, multi-paragraph responses to complex queries. Try prompting the model with questions that require in-depth explanations or analyses, and see how it responds. Additionally, explore the model's storytelling abilities by providing it with open-ended prompts or story starters and observe the creative narratives it generates.
