NousResearch

Models by this creator


Yarn-Mistral-7b-128k

NousResearch

Total Score

566

The Yarn-Mistral-7b-128k is a state-of-the-art language model for long context, further pretrained on long context data for 1500 steps using the YaRN extension method. It is an extension of the Mistral-7B-v0.1 model and supports a 128k token context window. The model was created by NousResearch and demonstrates strong performance on long context benchmarks.

Model inputs and outputs

The Yarn-Mistral-7b-128k model takes text as input and generates text as output. It can be used for a variety of language tasks such as text generation, summarization, and question answering.

Inputs

- Text prompts

Outputs

- Generated text

Capabilities

The Yarn-Mistral-7b-128k model excels at tasks requiring long-range context, such as summarizing long documents or generating coherent multi-paragraph text. It maintains good performance even when the context window is extended to 128k tokens, outperforming the original Mistral-7B-v0.1 model.

What can I use it for?

The Yarn-Mistral-7b-128k model can be used for a variety of natural language processing tasks, such as text generation, summarization, and question answering. Its long context capabilities make it well-suited for applications that require understanding and generating long-form text, such as creative writing, technical documentation, or research summarization.

Things to try

One interesting thing to try with the Yarn-Mistral-7b-128k model is to provide it with a lengthy prompt or context and see how it is able to generate coherent and relevant text. The model's ability to maintain context over a 128k token window allows it to produce more consistent and informative outputs compared to models with shorter context windows.
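As a concrete starting point, here is a minimal long-context generation sketch using the Hugging Face transformers library. The repo id, the trust_remote_code flag, and the generation settings are assumptions to check against the model card rather than official usage.

```python
# Hedged sketch: long-context summarization with Yarn-Mistral-7b-128k.
# Assumes the repo id "NousResearch/Yarn-Mistral-7b-128k" and that the YaRN
# context extension ships as custom model code (hence trust_remote_code=True).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "NousResearch/Yarn-Mistral-7b-128k"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,   # half precision to fit long contexts in memory
    device_map="auto",
    trust_remote_code=True,       # assumed requirement for the YaRN rope-scaling code
)

# A long document (potentially tens of thousands of tokens) followed by an instruction.
long_document = open("report.txt").read()
prompt = f"{long_document}\n\nSummarize the document above in five bullet points:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=400, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```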


Updated 5/28/2024


Hermes-2-Pro-Mistral-7B

NousResearch

Total Score

464

The Hermes-2-Pro-Mistral-7B is an upgraded and retrained version of the Nous Hermes 2 model. It was developed by NousResearch and includes an updated and cleaned version of the OpenHermes 2.5 Dataset, as well as a newly introduced Function Calling and JSON Mode dataset. This new version of Hermes maintains its excellent general task and conversation capabilities while also excelling at function calling and JSON structured outputs, and it improves on several other metrics. The Hermes-2-Pro-Mistral-7B model takes advantage of a special system prompt and a multi-turn function-calling structure with a new chatml role to make function calling reliable and easy to parse. It was developed in collaboration with interstellarninja and Fireworks.AI.

Model inputs and outputs

Inputs

- Natural language instructions and prompts

Outputs

- Natural language responses
- Structured JSON outputs
- Reliable function calls

Capabilities

The Hermes-2-Pro-Mistral-7B model has excellent general task and conversation capabilities, and it also excels at function calling and producing structured JSON outputs. It scored 90% on a function-calling evaluation and 84% on a structured JSON output evaluation.

What can I use it for?

The Hermes-2-Pro-Mistral-7B model can be used for a variety of tasks, including general language understanding and generation, task completion, and structured data output. Its strong performance on function calling and JSON output makes it well-suited for applications that require reliable and interpretable machine-generated responses, such as chatbots, virtual assistants, and data processing pipelines.

Things to try

One interesting thing to try with the Hermes-2-Pro-Mistral-7B model is exploring its capabilities around function calling and structured JSON output. The model's specialized prompt and multi-turn format for these tasks could enable novel applications that combine natural language interaction with reliable programmatic control and data manipulation.
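To see what the function-calling workflow looks like in practice, here is a hedged sketch. The exact system prompt and tag format the model was trained on is documented in its model card; the simplified prompt, tool schema, and repo id below are illustrative assumptions.

```python
# Hedged sketch: prompting Hermes-2-Pro-Mistral-7B for a tool call.
# The system prompt is a simplified stand-in for the format documented in the
# official model card; the repo id and tool schema are assumptions.
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "NousResearch/Hermes-2-Pro-Mistral-7B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16, device_map="auto")

weather_tool = {
    "name": "get_weather",
    "description": "Return the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

system_prompt = (
    "You are a function-calling assistant. When a tool is needed, reply only with a JSON "
    'object of the form {"name": <tool name>, "arguments": <arguments object>}.\n'
    "Available tool:\n" + json.dumps(weather_tool, indent=2)
)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What's the weather in Lisbon right now?"},
]

# apply_chat_template renders the conversation in the model's chat format.
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=200, do_sample=False)
reply = tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True)

# If the reply is a JSON tool call, parse it; otherwise treat it as plain text.
try:
    call = json.loads(reply)
    print("Tool call:", call["name"], call.get("arguments"))
except json.JSONDecodeError:
    print("Assistant:", reply)
```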


Updated 5/27/2024


Nous-Hermes-13b

NousResearch

Total Score

426

Nous-Hermes-13b is a state-of-the-art language model fine-tuned on over 300,000 instructions. This model was fine-tuned by NousResearch, with Teknium and Karan4D leading the fine-tuning process and dataset curation, Redmond AI sponsoring the compute, and several other contributors. The result is an enhanced Llama 13b model that rivals GPT-3.5-turbo in performance across a variety of tasks. This model stands out for its long responses, low hallucination rate, and absence of OpenAI censorship mechanisms. Similar models include Nous-Hermes-13B-GPTQ, nous-hermes-2-yi-34b-gguf, OpenHermes-2.5-Mistral-7B, and Hermes-2-Pro-Mistral-7B.

Model Inputs and Outputs

Nous-Hermes-13b is a text-to-text model, taking natural language prompts as input and generating coherent, informative responses. The model was fine-tuned on a diverse dataset of over 300,000 instructions, spanning topics like general conversation, coding, roleplaying, and more.

Inputs

- Natural language prompts or instructions

Outputs

- Detailed, coherent text responses to the provided prompts

Capabilities

Nous-Hermes-13b excels at a variety of language tasks, from open-ended conversation to following complex instructions. It can engage in substantive discussions on topics like science, philosophy, and current events, and it also performs well on tasks like code generation, question answering, and creative writing. The model's long-form responses and low hallucination rate make it a powerful tool for applications that require reliable, trustworthy language generation.

What Can I Use It For?

Nous-Hermes-13b could be used in a wide range of applications that require advanced language understanding and generation, such as:

- Conversational AI assistants
- Automated content generation (e.g. articles, stories, scripts)
- Educational and instructional materials
- Code generation and programming assistance
- Roleplaying and interactive fiction

Given the model's strong performance on a variety of benchmarks, it could also serve as a valuable base model for further fine-tuning and customization to meet specific domain or task requirements.

Things to Try

One interesting aspect of Nous-Hermes-13b is its ability to engage in substantive, multi-turn conversations. Try providing the model with a thought-provoking prompt or open-ended question and see how it responds and elaborates over the course of the interaction. The model's coherence and depth of insight can make for engaging and enlightening exchanges.

Another interesting avenue to explore is the model's capability for creative writing and storytelling. Provide it with a starting prompt or character and see how it develops a narrative, including introducing plot twists, vivid descriptions, and compelling dialogue.

Overall, Nous-Hermes-13b is a powerful language model that can be leveraged in a wide variety of applications. Its combination of strong performance, long-form generation, and lack of censorship mechanisms makes it a valuable tool for those seeking advanced, customizable language AI.
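For a quick experiment, the sketch below runs an instruction through a text-generation pipeline. The Alpaca-style template and the repo id are assumptions; check the model card for the exact prompt format.

```python
# Hedged sketch: instruction following with Nous-Hermes-13b via a text-generation pipeline.
# The "### Instruction / ### Response" template is an assumption commonly used with this
# model family; the repo id should also be verified against the model card.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="NousResearch/Nous-Hermes-13b",  # assumed repo id
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = (
    "### Instruction:\n"
    "Explain, in three short paragraphs, why tides occur twice a day.\n\n"
    "### Response:\n"
)

result = generator(prompt, max_new_tokens=400, do_sample=True, temperature=0.7, return_full_text=False)
print(result[0]["generated_text"])
```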


Updated 5/28/2024


Nous-Hermes-2-Mixtral-8x7B-DPO

NousResearch

Total Score

372

Nous-Hermes-2-Mixtral-8x7B-DPO is the new flagship Nous Research model, trained over the Mixtral 8x7B MoE LLM. The model was trained on over 1,000,000 entries of primarily GPT-4 generated data, as well as other high-quality data from open datasets across the AI landscape, achieving state-of-the-art performance on a variety of tasks. This is the SFT + DPO version of Mixtral Hermes 2; an SFT-only version is also available. The model was developed in collaboration with Together.ai, who sponsored the compute for the many experiments. Similar models include the Hermes-2-Pro-Mistral-7B and the Nous-Hermes-13B, which have their own unique capabilities and use cases.

Model inputs and outputs

Inputs

- Natural language prompts for text generation
- Content for tasks like code generation, summarization, and open-ended conversation

Outputs

- Generated text in response to prompts
- Structured outputs like JSON for tasks like API interaction
- Responses to open-ended questions and conversation

Capabilities

The Nous-Hermes-2-Mixtral-8x7B-DPO model has shown strong performance on a variety of benchmarks, including GPT4All, AGIEval, and BigBench. It demonstrates robust text generation capabilities, as showcased by examples like writing code for data visualization, generating cyberpunk poems, and performing backtranslation. The model also excels at function calling and structured JSON output.

What can I use it for?

The versatile capabilities of Nous-Hermes-2-Mixtral-8x7B-DPO make it useful for a wide range of applications. Some potential use cases include:

- Automated content generation (articles, stories, poems, etc.)
- Code generation and AI-assisted programming
- Conversational AI assistants for customer service or education
- Data analysis and visualization
- Specialized task completion via structured outputs (e.g. APIs, JSON)

Things to try

One interesting thing to explore with Nous-Hermes-2-Mixtral-8x7B-DPO is its ability to engage in multi-turn conversations using the ChatML prompt format. By leveraging system prompts and roles, you can guide the model's responses and prompt it to take on different personas or styles of interaction, which can unlock novel and creative outputs.

Another avenue to investigate is the model's performance on specialized tasks like function calling and JSON output generation. The maintainers have released evaluation datasets and code to test these capabilities, which could inspire new applications and integrations.
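The sketch below builds one of those ChatML conversations by hand, with a system persona steering the response. The repo id and sampling settings are assumptions; tokenizer.apply_chat_template can produce the same layout automatically.

```python
# Hedged sketch: a hand-built ChatML prompt with a system persona for
# Nous-Hermes-2-Mixtral-8x7B-DPO. Repo id and sampling settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto")

# ChatML turns: each message is wrapped in <|im_start|>role ... <|im_end|> markers.
prompt = (
    "<|im_start|>system\n"
    "You are a meticulous data-visualization mentor who answers with short code examples.<|im_end|>\n"
    "<|im_start|>user\n"
    "How would I plot monthly revenue as a bar chart in matplotlib?<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=300, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```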


Updated 5/28/2024


Genstruct-7B

NousResearch

Total Score

353

Genstruct-7B is an instruction-generation model designed by NousResearch. It is trained to create valid instructions given a raw text corpus, enabling the creation of new, partially synthetic instruction finetuning datasets. This work was inspired by Ada-Instruct, which trained a custom instruction-generation model, whereas previous methods largely relied on in-context approaches. Genstruct-7B takes this approach further by grounding the generations in user-provided context passages. It is trained to generate questions involving complex scenarios that require detailed reasoning, allowing models trained on the generated data to reason step-by-step. This contrasts with models like ChatGPT and RAG systems, which use few-shot prompting or retrieve information from an external knowledge base.

Model inputs and outputs

Inputs

- **Context passages**: Text provided by the user that grounds the instruction generations

Outputs

- **Instructions**: Novel instructions generated based on the input context passages, involving complex reasoning and scenarios

Capabilities

Genstruct-7B can be used to create rich, contextual instruction datasets for training downstream models. By generating instructions that require step-by-step reasoning, it enables the development of models with stronger general language understanding and problem-solving abilities. This contrasts with models trained on more simplistic or templated instructions.

What can I use it for?

The Genstruct-7B model could be used as a tool to quickly generate diverse datasets for training new AI models, across a wide range of domains and applications. For example, you could use it to create instruction datasets for task-oriented dialog, procedural text generation, or educational applications that require complex reasoning.

Things to try

One interesting thing to try with Genstruct-7B would be to experiment with the level of complexity and reasoning required in the generated instructions. By adjusting the input context passages, you could explore how this impacts the downstream model's capabilities and performance on benchmarks like HellaSwag, PIQA, and GSM8K. This could yield insights into the types of instruction-based datasets that are most effective for training robust language models.
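A minimal sketch of that workflow: feed the model a raw passage and let it continue with a grounded instruction. The passage-wrapping format shown here is only an illustrative stand-in (the model card documents the real one), and the repo id is assumed.

```python
# Hedged sketch: generating a grounded instruction with Genstruct-7B.
# The plain Title/Content layout below is an illustrative stand-in for the
# passage format defined in the model card; the repo id is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "NousResearch/Genstruct-7B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto")

# A raw corpus passage that will ground the generated instruction.
passage_title = "Photosynthesis"
passage_text = (
    "Photosynthesis converts light energy into chemical energy stored in glucose. "
    "The light-dependent reactions occur in the thylakoid membranes, while the "
    "Calvin cycle fixes carbon dioxide in the stroma."
)

prompt = f"Title: {passage_title}\nContent: {passage_text}\n\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=300, do_sample=True, temperature=0.8)
generated = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

# The continuation should contain a new instruction/question grounded in the passage,
# which can then be added to a synthetic finetuning dataset.
print(generated)
```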


Updated 5/28/2024


Hermes-2-Pro-Llama-3-8B

NousResearch

Total Score

351

The Hermes-2-Pro-Llama-3-8B model is an upgraded, retrained version of the original Nous Hermes 2 model. It was developed by NousResearch and consists of an updated and cleaned version of the OpenHermes 2.5 Dataset, as well as a newly introduced Function Calling and JSON Mode dataset. Compared to the original Hermes 2, this new version maintains excellent general task and conversation capabilities, while also excelling at Function Calling, JSON Structured Outputs, and other key metrics.

The Hermes-2-Pro-Mistral-7B and Hermes-2-Pro-Mistral-7B-GGUF models are similar, also developed by NousResearch. The 7B version uses the Mistral architecture, while the Llama-3 8B version uses the Llama architecture. Both models leverage the same dataset and fine-tuning approach to provide powerful language understanding and generation capabilities.

Model inputs and outputs

Inputs

- **Text prompts**: The model accepts natural language text prompts as input, which can include instructions, questions, or conversational dialogue.
- **Function call inputs**: The model can also accept structured function call inputs, where the user specifies the function name and arguments to be executed.
- **JSON schema**: For structured output mode, the model expects the user to provide a JSON schema that defines the desired output format.

Outputs

- **Natural language responses**: The model generates coherent, contextually relevant natural language responses to the provided prompts.
- **Structured function call outputs**: When provided with a function call, the model will output the result of executing that function, formatted as a JSON object.
- **Structured JSON outputs**: When prompted with a JSON schema, the model will generate a JSON object that adheres to the specified structure.

Capabilities

The Hermes-2-Pro-Llama-3-8B model excels at a wide range of language tasks, including general conversation, task completion, and structured data processing. It has been evaluated to have 91% accuracy on function calling tasks and 84% accuracy on JSON structured output tasks, demonstrating its strong capabilities in these areas.

Some key capabilities of the model include:

- Engaging in natural language conversations and providing helpful, informative responses
- Executing specific functions or tasks based on provided inputs and returning the results in a structured format
- Generating JSON outputs that adhere to a predefined schema, enabling integration with downstream applications that require structured data

What can I use it for?

The Hermes-2-Pro-Llama-3-8B model could be useful for a variety of applications that require advanced language understanding and generation, such as:

- **Conversational assistants**: The model's strong conversational abilities make it well-suited for building chatbots, virtual assistants, and other interactive applications.
- **Task automation**: The model's function calling capabilities allow it to be integrated into workflows that require the execution of specific tasks or the generation of structured data outputs.
- **Data processing and transformation**: The model's structured output generation capabilities can be leveraged to convert unstructured text into formatted data, facilitating integration with other systems and applications.

Things to try

One interesting aspect of the Hermes-2-Pro-Llama-3-8B model is its ability to handle multi-turn function calling interactions. By using the provided system prompt and structured input format, users can engage the model in a back-and-forth dialogue, where the model executes functions, returns the results, and the user can then provide additional input or instructions.

Another compelling feature is the model's structured JSON output generation. By defining a specific JSON schema, users can prompt the model to generate outputs that adhere to a predefined structure, enabling seamless integration with other systems and applications that require structured data.

Overall, the Hermes-2-Pro-Llama-3-8B model offers a powerful combination of natural language understanding, task execution, and structured data generation capabilities, making it a versatile tool for a wide range of language-based applications.
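Here is a hedged sketch of the JSON-mode pattern: the prompt carries a schema, and the output is parsed and validated before use. The system-prompt wording, repo id, and schema are assumptions, not the official JSON-mode template.

```python
# Hedged sketch: asking Hermes-2-Pro-Llama-3-8B for output matching a JSON schema,
# then validating the result. The system prompt is a simplified stand-in for the
# JSON-mode format in the model card; repo id and schema are assumptions.
import json
import torch
from jsonschema import ValidationError, validate
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "NousResearch/Hermes-2-Pro-Llama-3-8B"  # assumed repo id

schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        "keywords": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "sentiment", "keywords"],
}

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "system",
     "content": "Answer only with a JSON object that conforms to this schema:\n" + json.dumps(schema)},
    {"role": "user",
     "content": "Summarize this review: 'The headphones sound great but the battery dies quickly.'"},
]

input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=200, do_sample=False)
reply = tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True)

try:
    data = json.loads(reply)
    validate(instance=data, schema=schema)   # raises if the output drifts from the schema
    print("Valid structured output:", data)
except (json.JSONDecodeError, ValidationError) as err:
    print("Model output did not match the schema:", err)
```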


Updated 6/1/2024


Nous-Hermes-2-Vision-Alpha

NousResearch

Total Score

299

Nous-Hermes-2-Vision stands as a pioneering Vision-Language Model, leveraging advancements from the renowned OpenHermes-2.5-Mistral-7B by teknium. This model incorporates two pivotal enhancements, setting it apart as a cutting-edge solution. It harnesses the SigLIP-400M, a more lightweight vision encoder that delivers a remarkable boost in performance. Additionally, the training data includes a unique function calling feature, transforming Nous-Hermes-2-Vision into a Vision-Language Action Model.

Model inputs and outputs

Nous-Hermes-2-Vision is a multimodal model that takes both image and text inputs and generates text outputs. The model can be used for a variety of tasks, including image-to-text generation, image-based question answering, and vision-language instruction following.

Inputs

- **Images**: The model can accept various image formats, such as JPG, PNG, or WebP, as input.
- **Text**: The model can accept text prompts or instructions as input, which can be used to guide the generation or processing of the input image.

Outputs

- **Text**: The model generates textual output, such as captions, descriptions, or responses to questions about the input image.

Capabilities

Nous-Hermes-2-Vision excels at tasks that require understanding and reasoning about visual information in conjunction with language. For example, the model can be used to generate detailed captions for images, answer questions about the content of an image, or follow instructions for performing actions based on the visual input.

What can I use it for?

With its versatile capabilities, Nous-Hermes-2-Vision can be applied to a wide range of projects and use cases. Some potential applications include:

- **Image captioning**: Generate natural language captions for images to assist with accessibility, search, or content organization.
- **Visual question answering**: Answer questions about the content of an image, such as identifying objects, people, or activities.
- **Visual instruction following**: Use the model to understand and follow step-by-step visual instructions, such as for assembling products or completing tasks.
- **Multimodal content generation**: Combine visual and textual inputs to create compelling, contextual content for creative applications or marketing purposes.

Things to try

One interesting aspect of Nous-Hermes-2-Vision is its ability to leverage function calling to enhance its capabilities. By incorporating a custom dataset with function calling, the model can be used to perform specific actions or computations based on the input image and text. For example, you could provide the model with an image of a stock chart and a prompt to "Analyze the stock fundamentals for this company," and the model would generate a detailed response with the relevant financial data.

This function calling capability sets Nous-Hermes-2-Vision apart from traditional vision-language models and opens up a wide range of possibilities for integrating the model into automated workflows or decision-support systems.
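For experimentation, a LLaVA-style loading path is a plausible starting point, sketched below. Whether this alpha checkpoint loads through these transformers classes, along with the repo id and prompt layout, are assumptions to verify against the model card.

```python
# Hedged sketch: image question answering in the LLaVA style. The repo id,
# the choice of LlavaForConditionalGeneration, and the prompt layout are all
# assumptions about this alpha checkpoint, not confirmed usage.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

MODEL_ID = "NousResearch/Nous-Hermes-2-Vision-Alpha"  # assumed repo id

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = LlavaForConditionalGeneration.from_pretrained(MODEL_ID, torch_dtype=torch.float16, device_map="auto")

image = Image.open("stock_chart.png")  # any local JPG/PNG/WebP
prompt = "USER: <image>\nWhat trend does this chart show over the last quarter?\nASSISTANT:"

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device, torch.float16)
output = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(processor.decode(output[0], skip_special_tokens=True))
```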


Updated 5/28/2024


Nous-Hermes-Llama2-13b

NousResearch

Total Score

299

Nous-Hermes-Llama2-13b is a state-of-the-art language model fine-tuned on over 300,000 instructions by Nous Research. The model was trained on a diverse dataset including synthetic GPT-4 outputs, the GPTeacher dataset, and other high-quality datasets. Similar models include the Nous-Hermes-13b and Nous-Hermes-2-Mixtral-8x7B-DPO, which were also developed by Nous Research.

Model inputs and outputs

Nous-Hermes-Llama2-13b is a text-to-text model, meaning it takes text as input and generates new text as output. The model is capable of engaging in open-ended conversations, following instructions, and completing a variety of language tasks.

Inputs

- Free-form text in natural language

Outputs

- Generated text in natural language, which can range from short responses to long-form content

Capabilities

The model stands out for its long responses, lower hallucination rate, and absence of OpenAI censorship mechanisms. It has demonstrated strong performance on a variety of benchmarks, including GPT4All, AGIEval, and BigBench.

What can I use it for?

Nous-Hermes-Llama2-13b can be used for a wide range of language tasks, from creative writing to task completion. It could be particularly useful for applications that require long-form content generation, such as writing articles, stories, or reports. The model's strong performance on instruction following also makes it well-suited for use cases like virtual assistants, chatbots, and productivity tools.

Things to try

One interesting aspect of Nous-Hermes-Llama2-13b is its ability to engage in open-ended conversations and provide detailed, thoughtful responses. You could try prompting the model with complex questions or philosophical prompts to see how it responds. Additionally, the model's low hallucination rate and lack of censorship mechanisms could make it useful for research or exploration into the nature of language models and their capabilities.
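Since long-form output is this model's strong suit, the sketch below streams a story as it is generated. The Alpaca-style template and repo id are assumptions to confirm against the model card.

```python
# Hedged sketch: streaming a long-form response from Nous-Hermes-Llama2-13b.
# The "### Instruction / ### Response" template is an assumed prompt format;
# the repo id is also an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

MODEL_ID = "NousResearch/Nous-Hermes-Llama2-13b"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16, device_map="auto")

prompt = (
    "### Instruction:\n"
    "Write a 500-word short story about a lighthouse keeper who discovers a message in a bottle.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
streamer = TextStreamer(tokenizer, skip_prompt=True)  # print tokens as they are generated

# Long-form generation is where this model is reported to shine, so allow plenty of new tokens.
model.generate(**inputs, max_new_tokens=800, do_sample=True, temperature=0.8, streamer=streamer)
```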


Updated 5/27/2024


Nous-Hermes-2-Yi-34B

NousResearch

Total Score

232

Nous-Hermes-2-Yi-34B is a state-of-the-art Yi fine-tune developed by NousResearch. It was trained on 1,000,000 entries of primarily GPT-4 generated data, as well as other high-quality data from open datasets across the AI landscape. This model outperforms previous Nous-Hermes and Open-Hermes models, achieving new heights on benchmarks like GPT4All, AGIEval, and BigBench. It surpasses many popular finetuned models as well.

Model inputs and outputs

Inputs

- **Text prompts**: The model accepts text prompts as input, which can be used to generate a wide variety of text outputs.

Outputs

- **Generated text**: The model can generate coherent, contextually relevant text in response to the provided input prompts. This includes discussions about complex topics like gravity, code generation, and more.

Capabilities

The Nous-Hermes-2-Yi-34B model demonstrates impressive capabilities across a range of tasks. It can engage in substantive discussions about scientific concepts, generate functional code snippets, and even roleplay as fictional characters. The model's strong performance on benchmarks like GPT4All, AGIEval, and BigBench indicates its broad competence.

What can I use it for?

The Nous-Hermes-2-Yi-34B model could be useful for a variety of applications that require advanced natural language processing and generation, such as:

- Chatbots and virtual assistants
- Content generation for blogs, articles, or social media
- Code generation and programming assistance
- Research and experimentation in the field of artificial intelligence

Things to try

One interesting aspect of the Nous-Hermes-2-Yi-34B model is its ability to engage in multi-turn dialogues and follow complex instructions, as demonstrated in the examples provided. Users could experiment with prompts that involve longer-form interactions or task completion to further explore the model's capabilities.
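A 34B model is heavy for a single GPU, so the sketch below loads it with 4-bit quantization before running a chat-style prompt. The repo id and quantization settings are illustrative assumptions; the bitsandbytes package and a CUDA device are required.

```python
# Hedged sketch: running Nous-Hermes-2-Yi-34B on one GPU via 4-bit quantization.
# Repo id and memory settings are assumptions, not official guidance.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "NousResearch/Nous-Hermes-2-Yi-34B"  # assumed repo id

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, quantization_config=bnb_config, device_map="auto")

messages = [
    {"role": "system", "content": "You are a patient physics tutor."},
    {"role": "user", "content": "Explain why astronauts on the ISS feel weightless even though gravity still acts on them."},
]

input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=350, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
```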


Updated 5/28/2024


Nous-Capybara-34B

NousResearch

Total Score

230

The Nous-Capybara-34B V1.9 is the first 34B Nous model and the first 200K context length Nous model, trained by Nous Research. It was fine-tuned on the Capybara dataset, which leverages Nous' novel "Amplify-Instruct" data synthesis technique. This technique combines top-performing data synthesis methods like Airoboros, Evol-Instruct (WizardLM), Orca, Vicuna, Know_Logic, Lamini, and FLASK, along with seed instructions from datasets like Airoboros, Know Logic, EverythingLM, GPTeacher, and LessWrong. The current Capybara dataset contains 20K training examples, which is 10 times smaller than many similarly performing models. This has significant scaling implications for Nous' future generations of models.

The model was fine-tuned by Nous Research as part of the Capybara/Amplify-Instruct project led by Luigi D. (LDJ), with significant dataset formation contributions from J-Supha and general compute and experimentation management by Jeffrey Q. The training was sponsored by A16Z and Yield Protocol.

Model inputs and outputs

The Nous-Capybara-34B is a text-to-text AI model that can take in a wide range of textual inputs and generate relevant responses. The model is trained on a large corpus of diverse data, enabling it to handle a variety of tasks and queries.

Inputs

- Freeform text prompts or queries
- Conversational exchanges
- Instructions or requests for information, analysis, or task completion

Outputs

- Relevant and coherent textual responses
- Informative and well-reasoned answers to questions
- Detailed plans or step-by-step instructions for completing tasks
- Creative and engaging text generation

Capabilities

The Nous-Capybara-34B model is capable of tackling a wide range of language tasks, from natural language understanding and generation to following complex instructions and completing multi-step tasks. It can engage in substantive conversations, provide detailed explanations and analyses, and generate creative and coherent text.

One key capability of the model is its long-form response generation, which allows it to produce detailed and nuanced outputs. It also exhibits a low hallucination rate, meaning it is less prone to generating factually incorrect information. Additionally, the model is not subject to the censorship mechanisms found in some other large language models.

What can I use it for?

The Nous-Capybara-34B model is a versatile tool that can be applied to a variety of projects and use cases. Some potential applications include:

- Building advanced chatbots and virtual assistants to handle complex queries and tasks
- Automating content generation for blogs, articles, or other written materials
- Enhancing language understanding and generation capabilities in various software applications
- Powering research and analysis tools that require in-depth textual processing and generation

For example, you could use the Nous-Capybara-34B model to build a virtual assistant that can engage in detailed conversations, provide step-by-step instructions for completing tasks, and generate creative and informative text. This could be useful for customer service, educational, or research applications.

Things to try

One interesting aspect of the Nous-Capybara-34B model is its ability to generate long, coherent responses. You could experiment with prompting the model to elaborate on a specific topic or provide a detailed analysis of a complex issue. This could help you uncover the model's depth of knowledge and its capacity for nuanced, thoughtful discourse.

Another area to explore is the model's performance on multi-step tasks or instructions. You could provide the model with a set of requirements or a problem to solve and see how it breaks the problem down and outlines a comprehensive solution. This could be particularly useful for applications that require task planning and execution.

Overall, the Nous-Capybara-34B model represents an exciting advancement in large language model technology, with the potential to enable a wide range of innovative applications and use cases.
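To try the multi-step planning idea above, the sketch below keeps a running transcript in a plain USER:/ASSISTANT: format. That turn format and the repo id are assumptions borrowed from similar chat models; verify against the model card.

```python
# Hedged sketch: a multi-turn exchange with Nous-Capybara-34B.
# The "USER:/ASSISTANT:" turn format is an assumed template; the repo id is
# also an assumption to confirm against the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "NousResearch/Nous-Capybara-34B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto")

# Build the running transcript turn by turn; with a 200K context window the
# full conversation history can simply be kept in the prompt.
history = [
    ("USER", "I need to migrate a small Flask app to a new server this weekend. Outline a plan."),
]
prompt = "\n".join(f"{role}: {text}" for role, text in history) + "\nASSISTANT:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=500, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```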


Updated 5/28/2024