Pygmalion-13B-SuperHOT-8K-GPTQ

Maintainer: TheBloke

Last updated 5/28/2024

🤷

Property	Value
Run this model	Run on HuggingFace
API spec	View on HuggingFace
Github link	No Github link provided
Paper link	No paper link provided

Create account to get full access

Model overview

The Pygmalion-13B-SuperHOT-8K-GPTQ model is a merge of TehVenom's Pygmalion 13B and Kaio Ken's SuperHOT 8K, quantized to 4-bit using GPTQ-for-LLaMa. It offers up to 8K context size, which has been tested to work with ExLlama and text-generation-webui.

Similar models include the Wizard-Vicuna-13B-Uncensored-SuperHOT-8K-GPTQ, which combines Eric Hartford's Wizard Vicuna 13B Uncensored with Kaio Ken's SuperHOT 8K, and the Llama-2-13B-GPTQ and Llama-2-7B-GPTQ models, which are GPTQ versions of Meta's Llama 2 models.

Model inputs and outputs

Inputs

The model accepts natural language text as input.

Outputs

The model generates natural language text as output.

Capabilities

The Pygmalion-13B-SuperHOT-8K-GPTQ model is capable of engaging in open-ended conversations and generating coherent and contextual text. Its extended 8K context size allows it to maintain continuity and coherence over longer passages of text.

What can I use it for?

This model could be used for a variety of natural language processing tasks, such as:

Open-ended chatbots and assistants: The model's capabilities make it well-suited for building conversational AI assistants that can engage in open-ended dialogue.
Content generation: The model could be used to generate text for creative writing, storytelling, and other content creation purposes.
Question answering and knowledge retrieval: With its large knowledge base, the model could be used to answer questions and retrieve information on a wide range of topics.

Things to try

One key aspect of this model is its ability to maintain coherence and context over longer passages of text due to the increased 8K context size. This could be particularly useful for applications that require a strong sense of narrative or conversational flow, such as interactive fiction, roleplaying, or virtual assistants.

Developers could explore ways to leverage this extended context to create more immersive and coherent experiences for users, such as by allowing the model to maintain character personalities, world-building details, and the progression of a storyline over longer interactions.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Models

🧠

WizardLM-13B-V1-1-SuperHOT-8K-GPTQ

TheBloke

WizardLM-13B-V1-1-SuperHOT-8K-GPTQ is a 13 billion parameter large language model that was created by merging WizardLM's WizardLM 13B V1.1 with Kaio Ken's SuperHOT 8K and then quantizing the model to 4-bit precision using GPTQ-for-LLaMa. This experimental model offers an increased context size of up to 8K tokens, which has been tested to work with the ExLlama library and text-generation-webui. Model inputs and outputs Inputs Prompts**: The model takes prompts as input, which can be in the form of natural language text, code, or a combination of the two. Outputs Text generation**: The primary output of the model is generated text, which can be used for a variety of tasks such as language modeling, summarization, translation, and creative writing. Capabilities The WizardLM-13B-V1-1-SuperHOT-8K-GPTQ model is capable of generating coherent and contextually relevant text across a wide range of topics. Its increased context size allows it to maintain coherence and consistency over longer stretches of text, making it particularly well-suited for tasks that require sustained reasoning or storytelling. What can I use it for? This model can be used for a variety of natural language processing tasks, such as: Creative writing**: The model's ability to generate coherent and contextually relevant text makes it useful for tasks like story writing, dialogue generation, and creative prompt completion. Task-oriented dialogue**: With its increased context size, the model can be used to build interactive conversational agents that can engage in multi-turn dialogues and maintain context over longer exchanges. Content generation**: The model can be used to generate text for a wide range of applications, such as blog posts, articles, product descriptions, and more. Things to try One interesting aspect of this model is its ability to leverage the extended 8K context size. By setting the appropriate parameters in tools like text-generation-webui, you can experiment with the model's performance on tasks that require maintaining coherence and consistency over longer stretches of text. Additionally, the model's quantization to 4-bit precision makes it more efficient and accessible for deployment on a variety of hardware platforms.

Updated Invalid Date

Text-to-Text

🔗

Pygmalion-2-13B-GPTQ

TheBloke

The Pygmalion-2-13B-GPTQ is a quantized version of the Pygmalion 2 13B language model created by PygmalionAI. It is a merge of Pygmalion-2 13B and Gryphe's MythoMax 13B model. According to the maintainer TheBloke, this model seems to outperform the original MythoMax in roleplaying and chat tasks. Similar quantized models available from TheBloke include the Mythalion-13B-GPTQ and the Llama-2-13B-GPTQ. These all provide different quantization options to optimize for performance on various hardware. Model inputs and outputs Inputs The model accepts text prompts as input, which can be formatted using the provided `, , and ` tokens. This allows injecting context, indicating user input, and specifying where the model should generate a response. Outputs The model generates text outputs in response to the provided prompts. It is designed to excel at roleplaying and creative writing tasks. Capabilities The Pygmalion-2-13B-GPTQ model is capable of generating coherent, contextual responses to prompts. It performs well on roleplaying and chat tasks, able to maintain a consistent persona and produce long-form responses. The model's capabilities make it suitable for applications like interactive fiction, creative writing assistants, and conversational AI agents. What can I use it for? The Pygmalion-2-13B-GPTQ model can be used for a variety of natural language generation tasks, with a particular focus on roleplaying and creative writing. Some potential use cases include: Interactive Fiction**: The model's ability to maintain character personas and generate contextual responses makes it well-suited for developing choose-your-own-adventure style interactive fiction experiences. Creative Writing Assistance**: The model can be used to assist human writers by generating text passages, suggesting plot ideas, or helping to develop characters and worlds. Conversational AI**: The model's chat-oriented capabilities can be leveraged to build more natural and engaging conversational AI agents for customer service, virtual assistants, or other interactive applications. Things to try One interesting aspect of the Pygmalion-2-13B-GPTQ model is its use of the provided `, , and ` tokens to structure prompts and conversations. Experimenting with different ways to leverage this format, such as defining custom personas or modes for the model to operate in, can unlock novel use cases and interactions. Additionally, trying out the various quantization options provided by TheBloke (e.g. 4-bit, 8-bit with different group sizes and Act Order settings) can help you find the best balance of performance and resource usage for your specific hardware and application requirements.

Updated Invalid Date

Text-to-Text

🐍

WizardLM-13B-V1-0-Uncensored-SuperHOT-8K-GPTQ

TheBloke

The WizardLM-13B-V1-0-Uncensored-SuperHOT-8K-GPTQ model is a 13B parameter language model created by combining Eric Hartford's WizardLM 13B V1.0 Uncensored with Kaio Ken's SuperHOT 8K. The model has been quantized to 4-bit using the GPTQ-for-LLaMa tool, which allows for increased context size up to 8K tokens. This model is an experimental new GPTQ that offers expanded context compared to the original WizardLM 13B V1.0 Uncensored. Model inputs and outputs The WizardLM-13B-V1-0-Uncensored-SuperHOT-8K-GPTQ model takes text prompts as input and generates coherent, detailed responses. The model has been trained on a large corpus of online text data, allowing it to understand and converse on a wide range of topics. Inputs Text prompt**: A text prompt provided to the model to initiate the generation of a response. Outputs Generated text**: The model's response to the provided text prompt, which can be up to 8192 tokens in length. Capabilities The WizardLM-13B-V1-0-Uncensored-SuperHOT-8K-GPTQ model is a powerful language model capable of engaging in open-ended conversations, answering questions, and generating human-like text on a variety of subjects. Its expanded context size allows it to maintain coherence and provide more detailed responses compared to models with shorter context. What can I use it for? The WizardLM-13B-V1-0-Uncensored-SuperHOT-8K-GPTQ model can be used for a wide range of natural language processing tasks, such as chatbots, content generation, question answering, and creative writing. The increased context size makes it well-suited for applications that require longer-form, coherent responses. Things to try One interesting aspect of the WizardLM-13B-V1-0-Uncensored-SuperHOT-8K-GPTQ model is its ability to maintain context and narrative structure over longer text generation. Try providing the model with a multi-sentence prompt and see how it continues the story or expands on the initial ideas. The model's large knowledge base and generation capabilities make it well-suited for collaborative storytelling or worldbuilding exercises.

Updated Invalid Date

Text-to-Text

🧠

WizardLM-33B-V1-0-Uncensored-SuperHOT-8K-GPTQ

TheBloke

The WizardLM-33B-V1-0-Uncensored-SuperHOT-8K-GPTQ is a large language model created by TheBloke, a prominent AI researcher and model developer. This model is a variant of the WizardLM-33B model, which has been merged with Kaio Ken's SuperHOT 8K system to extend the context length to 8192 tokens. The model has been quantized to 4-bit precision using GPTQ, resulting in a more compact and efficient model for inference on GPU hardware. Similar models available from TheBloke include the Wizard-Vicuna-13B-Uncensored-SuperHOT-8K-GPTQ, which is a 13B version of the model with a similar architecture and capabilities, and the WizardLM-7B-uncensored-GPTQ and WizardLM-30B-Uncensored-GPTQ models, which are smaller and larger variants respectively. Model inputs and outputs Inputs Text prompts**: The model accepts free-form text prompts as input, which can be used to generate continuations, completions, or responses. Outputs Generated text**: The model outputs generated text, which can be used for a variety of applications such as content creation, dialogue generation, and language modeling. Capabilities The WizardLM-33B-V1-0-Uncensored-SuperHOT-8K-GPTQ model demonstrates impressive language generation capabilities, with the ability to produce coherent and contextually relevant text. The extended 8192 token context length allows the model to maintain continuity and coherence over longer stretches of text, making it particularly well-suited for applications that require sustained dialogue or narrative generation. What can I use it for? This model can be used for a wide range of language-based applications, such as: Content creation**: The model can be used to generate articles, stories, scripts, or other forms of written content. Dialogue systems**: The extended context length makes this model well-suited for building more natural and contextual chatbots or virtual assistants. Summarization**: The model can be used to generate concise summaries of longer text passages. Question answering**: The model can be used to answer questions based on the provided context. Potential commercial applications for this model include creative content generation, customer service automation, and research and development in natural language processing. Things to try One interesting aspect of this model is its ability to maintain coherence and continuity over longer stretches of text, thanks to the extended 8192 token context length. You could try providing the model with a complex or multi-part prompt, and observe how it is able to build upon and expand the initial context to generate a cohesive and engaging response. Another interesting direction to explore would be fine-tuning or further training the model on specialized datasets, in order to adapt its capabilities to more specific use cases or domains. This could involve incorporating domain-specific knowledge or adjusting the model's tone, style, or behavior to better suit the intended application.

Updated Invalid Date

Text-to-Text