Phi-3-mini-128k-instruct

Maintainer: microsoft

Total Score

1.3K

Last updated 9/17/2024

🧠

PropertyValue
Run this modelRun on HuggingFace
API specView on HuggingFace
Github linkNo Github link provided
Paper linkNo paper link provided

Create account to get full access

or

If you already have an account, we'll log you in

Model overview

The phi-3-mini-128k-instruct is a 3.8 billion-parameter, lightweight, state-of-the-art open model trained using the Phi-3 datasets. It is part of the Phi-3 family, which also includes the Phi-3-mini-4k-instruct and Phi-3-mini-128k-instruct models. The Phi-3 models are designed to be efficient and effective, with a focus on reasoning capabilities like code, math, and logic.

Model inputs and outputs

The phi-3-mini-128k-instruct model takes text as input and generates text in response. It is best suited for prompts using a chat format, where the user provides a prompt and the model generates a relevant response.

Inputs

  • Prompt: The text prompt to send to the model.
  • Max Length: The maximum number of tokens to generate.
  • Temperature: Adjusts the randomness of the outputs, with higher values being more random.
  • Top K: Samples from the top K most likely tokens when decoding text.
  • Top P: Samples from the top P percentage of most likely tokens when decoding text.
  • Repetition Penalty: Penalty for repeated words in the generated text.
  • System Prompt: The system prompt provided to the model.
  • Seed: The seed for the random number generator.

Outputs

  • Generated Text: The text generated by the model in response to the input prompt.

Capabilities

The phi-3-mini-128k-instruct model has demonstrated robust and state-of-the-art performance on a variety of benchmarks, including common sense reasoning, language understanding, mathematics, coding, and logical reasoning. It is designed to be effective in memory/compute-constrained environments and latency-bound scenarios, while providing strong reasoning capabilities.

What can I use it for?

The phi-3-mini-128k-instruct model is intended for commercial and research use in English. It can be used as a building block for generative AI-powered features, such as chatbots, language-generation tools, and code assistants. The model's small size and strong reasoning abilities make it particularly well-suited for use in applications that require efficient and effective language processing.

Things to try

One interesting aspect of the phi-3-mini-128k-instruct model is its ability to follow instructions and adhere to safety measures. You could try prompting the model with tasks that require following specific instructions or navigating complex scenarios, and see how it responds. Additionally, you could experiment with using the model in combination with other AI tools or datasets to explore new and innovative applications.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Models

🚀

Phi-3-mini-4k-instruct

microsoft

Total Score

603

The phi-3-mini-4k-instruct is a 3.8B parameter, lightweight, state-of-the-art open model trained with the Phi-3 datasets, as described by the maintainer. It is part of the Phi-3 family of models, which includes other variants like the phi-3-mini-128k-instruct and phi-3-mini-128k-instruct that differ in their context length. The Phi-3 models are designed to be high-performing yet memory/compute-constrained, making them suitable for latency-bound scenarios and environments with limited resources. Model inputs and outputs The phi-3-mini-4k-instruct model takes text as input and generates text as output. It is particularly well-suited for prompts using a chat format, where the input is structured as a conversation between a user and an assistant. Inputs Prompt**: The text that the model will use to generate a response. System Prompt**: An optional system prompt that helps guide the model's behavior, such as instructing it to act as a helpful assistant. Additional parameters**: The model also accepts various parameters to control the generation process, such as temperature, top-k and top-p filtering, and stopping sequences. Outputs Generated Text**: The model's response to the provided prompt, which can be a continuation of the conversation, an answer to a question, or a generated piece of text. Capabilities The phi-3-mini-4k-instruct model has been fine-tuned to excel at tasks that require strong reasoning abilities, such as common sense reasoning, language understanding, math, coding, and logical reasoning. When evaluated on a range of benchmarks, the model has demonstrated state-of-the-art performance among models with less than 13 billion parameters. What can I use it for? The phi-3-mini-4k-instruct model is intended for a variety of commercial and research use cases in English, particularly those that require memory or compute-constrained environments, such as mobile applications, or latency-bound scenarios. It can be used as a building block for developing generative AI-powered features, such as chatbots, question-answering systems, and code generation tools. Things to try One interesting aspect of the phi-3-mini-4k-instruct model is its ability to engage in multi-turn conversations using the provided chat format. You can try prompting the model with a series of related questions or tasks and observe how it maintains context and generates coherent responses. Additionally, the model's strong performance on tasks like math and coding make it a compelling choice for developing educational or productivity-focused applications.

Read more

Updated Invalid Date

🛠️

Phi-3-small-128k-instruct

microsoft

Total Score

116

The Phi-3-small-128k-instruct is a 7B parameter, lightweight, state-of-the-art open model trained by Microsoft. It belongs to the Phi-3 family of models, which includes variants with different context lengths such as the Phi-3-small-8k-instruct and Phi-3-mini-128k-instruct. The model was trained on a combination of synthetic data and filtered publicly available websites, with a focus on high-quality and reasoning-dense properties. After initial training, the model underwent a post-training process that incorporated both supervised fine-tuning and direct preference optimization to enhance its ability to follow instructions and adhere to safety measures. When evaluated against benchmarks testing common sense, language understanding, math, code, long context and logical reasoning, the Phi-3-small-128k-instruct demonstrated robust and state-of-the-art performance among models of the same size and next size up. Model inputs and outputs Inputs Text**: The Phi-3-small-128k-instruct model is best suited for prompts using the chat format, where the input is provided as text. Outputs Generated text**: The model generates text in response to the input prompt. Capabilities The Phi-3-small-128k-instruct model showcases strong reasoning abilities, particularly in areas like code, math, and logic. It performs well on benchmarks evaluating common sense, language understanding, and logical reasoning. The model is also designed to be lightweight and efficient, making it suitable for memory/compute-constrained environments and latency-bound scenarios. What can I use it for? The Phi-3-small-128k-instruct model is intended for broad commercial and research use in English. It can be used as a building block for general-purpose AI systems and applications that require strong reasoning capabilities, such as: Memory/compute-constrained environments Latency-bound scenarios AI systems that need to excel at tasks like coding, math, and logical reasoning Microsoft has also released other models in the Phi-3 family, such as the Phi-3-mini-128k-instruct and Phi-3-medium-128k-instruct, which may be better suited for different use cases based on their size and capabilities. Things to try One interesting aspect of the Phi-3-small-128k-instruct model is its strong performance on benchmarks evaluating logical reasoning and math skills. Developers could explore using this model as a foundation for building AI systems that need to tackle complex logical or mathematical problems, such as automated theorem proving, symbolic reasoning, or advanced question-answering. Another area to explore is the model's ability to follow instructions and adhere to safety guidelines. Developers could investigate how the model's instruction-following and safety-conscious capabilities could be leveraged in applications that require reliable and trustworthy AI assistants, such as in customer service, education, or sensitive domains.

Read more

Updated Invalid Date

🔍

Phi-3-small-8k-instruct

microsoft

Total Score

108

The Phi-3-small-8k-instruct is a 7B parameter, lightweight, state-of-the-art open model from Microsoft. It is part of the Phi-3 family of models, which includes variants with different context lengths - 8K and 128K. The Phi-3 models are trained on a combination of synthetic data and filtered public websites, with a focus on high-quality and reasoning-dense properties. The Phi-3-small-8k-instruct model has undergone a post-training process that incorporates both supervised fine-tuning and direct preference optimization to enhance its ability to follow instructions and adhere to safety measures. When evaluated on benchmarks testing common sense, language understanding, math, code, long context, and logical reasoning, the model demonstrated robust and state-of-the-art performance among models of similar size. Model inputs and outputs Inputs Text prompts, best suited for the chat format Outputs Generated text responses to the input prompts Capabilities The Phi-3-small-8k-instruct model excels at tasks that require strong reasoning, such as math, coding, and logical analysis. It can provide detailed and coherent responses across a wide range of topics. What can I use it for? The Phi-3-small-8k-instruct model is intended for broad commercial and research use in English. It can be used in general-purpose AI systems and applications that require memory/compute constrained environments, low-latency scenarios, or robust reasoning capabilities. The model can accelerate research on language and multimodal models, and serve as a building block for generative AI-powered features. Things to try One interesting aspect of the Phi-3-small-8k-instruct model is its ability to provide step-by-step explanations and solutions for math and coding problems. You can try prompting the model with math equations or coding challenges and observe how it breaks down the problem and walks through the solution. Another interesting area to explore is the model's language understanding and common sense reasoning capabilities. You can provide it with prompts that require an understanding of the physical world, social norms, or abstract concepts, and see how it responds.

Read more

Updated Invalid Date

🤿

Phi-3-medium-128k-instruct

microsoft

Total Score

295

The Phi-3-medium-128k-instruct is a 14B parameter, lightweight, state-of-the-art open model developed by Microsoft. It was trained on synthetic data and filtered publicly available websites, with a focus on high-quality and reasoning-dense properties. The model belongs to the Phi-3 family, which also includes Phi-3-mini-128k-instruct and Phi-3-mini-4k-instruct, differing in parameter size and context length. The model underwent a post-training process that incorporated supervised fine-tuning and direct preference optimization to enhance its instruction following and safety. When evaluated on benchmarks testing common sense, language understanding, math, code, long context, and logical reasoning, the Phi-3-medium-128k-instruct demonstrated robust and state-of-the-art performance among models of similar and larger sizes. Model inputs and outputs Inputs Text**: The Phi-3-medium-128k-instruct model is best suited for text-based prompts, particularly those using a chat format. Outputs Generated text**: The model generates relevant and coherent text in response to the input prompt. Capabilities The Phi-3-medium-128k-instruct model showcases strong reasoning abilities across a variety of domains, including common sense, language understanding, mathematics, coding, and logical reasoning. For example, it can provide step-by-step solutions to math problems, generate code to implement algorithms, and engage in multi-turn conversations to demonstrate its understanding of complex topics. What can I use it for? The Phi-3-medium-128k-instruct model is intended for broad commercial and research use cases that require memory/compute-constrained environments, latency-bound scenarios, and strong reasoning capabilities. It can be used as a building block for developing generative AI-powered features, such as question-answering systems, code generation tools, and educational applications. Things to try One interesting aspect of the Phi-3-medium-128k-instruct model is its ability to handle long-form context. Try providing the model with a multi-paragraph prompt and see how it maintains coherence and relevance in its generated response. You can also experiment with using the model for specific tasks, such as translating technical jargon into plain language or generating step-by-step explanations for complex concepts.

Read more

Updated Invalid Date