mlx-community

Models by this creator

🛠️

Llama-2-7b-chat-mlx

mlx-community

Total Score

84

Llama-2-7b-chat-mlx is a 7 billion parameter version of Meta's Llama 2 family of large language models. It has been fine-tuned for dialogue use cases and converted to the MLX format for use with Apple's MLX framework. The Llama-2-7b-chat-hf and Llama-2-7b-chat models are similar dialogue-focused versions of Llama 2, but optimized for the Hugging Face Transformers library.

Model Inputs and Outputs

The Llama-2-7b-chat-mlx model takes in text input and generates text output. It is designed for conversational dialogue, with specific formatting requirements to achieve the expected chat-like performance, including the use of [INST] and <<SYS>> tags, BOS and EOS tokens, and whitespaces/line breaks.

Inputs

Text input to be used as a prompt for the model

Outputs

Generated text continuing the dialogue

Capabilities

The Llama-2-7b-chat-mlx model is capable of engaging in open-ended dialogue, answering questions, and generating human-like responses. It has been fine-tuned to perform well on tasks like commonsense reasoning, world knowledge, and reading comprehension. It generally outperforms open-source chat models on benchmarks and in human evaluations for helpfulness and safety.

What Can I Use It For?

The Llama-2-7b-chat-mlx model is well-suited for building conversational AI assistants, chatbots, and other applications that require natural language generation and understanding. It could power customer service interactions, task-oriented dialogues, or creative writing assistants. As noted in the maintainer's description, the model is intended for commercial and research use in English, and developers should perform safety testing and tuning tailored to their specific applications before deployment.

Things to Try

One interesting aspect of the Llama 2 models is their range of scale, from 7 billion to 70 billion parameters. You could experiment with different model sizes to see how performance and capabilities change. The Llama Model Index provides direct links to these other Llama 2 variants.
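To make the chat format above concrete, here is a minimal sketch using the mlx-lm package (assuming pip install mlx-lm, and that the repo id matches the model name above; the system prompt and generation parameters are illustrative, not prescribed by the model card):

```python
# Minimal sketch: prompting Llama-2-7b-chat-mlx with the Llama 2 chat template.
# Assumes mlx-lm is installed and its load/generate API matches recent releases.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Llama-2-7b-chat-mlx")

# Llama 2 chat expects [INST]/[/INST] instruction tags, with an optional
# <<SYS>> system block inside the first instruction.
prompt = (
    "[INST] <<SYS>>\n"
    "You are a helpful, concise assistant.\n"
    "<</SYS>>\n\n"
    "What is the capital of France? [/INST]"
)

response = generate(model, tokenizer, prompt=prompt, max_tokens=100)
print(response)
```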

Read more

Updated 5/27/2024

🌿

Meta-Llama-3-8B-Instruct-4bit

mlx-community

Total Score

67

The mlx-community/Meta-Llama-3-8B-Instruct-4bit model is a quantized version of the meta-llama/Meta-Llama-3-8B-Instruct model. The original model was developed and released by Meta as part of the Llama 3 family of large language models (LLMs). Llama 3 models are optimized for dialogue use cases and outperform many open-source chat models on common industry benchmarks. The Llama 3 models use supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align the models with human preferences for helpfulness and safety.

The 8B parameter version of the Llama 3 model is well-suited for applications that require a smaller, faster model. It maintains strong performance across a variety of tasks while being more efficient than the larger 70B parameter version. The mlx-community/Meta-Llama-3-8B-Instruct-4bit model further optimizes the 8B model by quantizing it to 4-bit precision, reducing the model size and inference time while preserving much of the original model's capabilities.

Model inputs and outputs

Inputs

Text data: The model takes text as input and generates text in response.

Outputs

Text generation: The model outputs generated text, which can be used for a variety of natural language processing tasks such as chatbots, content creation, and question answering.

Capabilities

The mlx-community/Meta-Llama-3-8B-Instruct-4bit model is capable of a wide range of text-to-text tasks. It can engage in open-ended dialogue, answer questions, summarize text, and even generate creative content like stories and poems. The model has been trained on a diverse dataset and can draw upon broad knowledge to provide informative and coherent responses.

What can I use it for?

The mlx-community/Meta-Llama-3-8B-Instruct-4bit model can be useful for a variety of applications, including:

Chatbots and virtual assistants: The model's conversational abilities make it well-suited for building chatbots and virtual assistants that can engage in natural dialogue.

Content creation: The model can be used to generate text for blog posts, articles, scripts, and other creative writing projects.

Question answering: The model can be used to build systems that answer questions on a wide range of topics.

Summarization: The model can be used to generate concise summaries of longer text passages.

Things to try

One interesting aspect of the mlx-community/Meta-Llama-3-8B-Instruct-4bit model is its ability to follow instructions and adapt its output to the specified context. By providing a clear system prompt, you can get the model to respond in different personas or styles, such as a pirate chatbot or a creative writing assistant (see the sketch below). Experimenting with different system prompts can unlock new capabilities and use cases for the model.

Another interesting area to explore is the model's performance on specialized tasks or domains. While the model has been trained on a broad dataset, it may be possible to further fine-tune it on domain-specific data to enhance its capabilities in areas like technical writing, legal analysis, or scientific research.
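Here is a minimal sketch of the system-prompt steering described under "Things to try", using the mlx-lm package (assumed installed; it relies on the tokenizer exposing Hugging Face's apply_chat_template, and the persona and parameters are illustrative):

```python
# Sketch: steering Meta-Llama-3-8B-Instruct-4bit with a system prompt.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")

messages = [
    {"role": "system", "content": "You are a pirate chatbot. Answer in pirate speak."},
    {"role": "user", "content": "How do I sort a list in Python?"},
]

# Render the messages into the Llama 3 instruct format and append the
# assistant header so the model begins its reply.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

print(generate(model, tokenizer, prompt=prompt, max_tokens=200))
```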

Read more

Updated 6/4/2024

🔮

Mixtral-8x22B-4bit

mlx-community

Total Score

51

The Mixtral-8x22B-4bit is a large language model (LLM) published by the mlx-community team. It was converted from the Mixtral-8x22B-v0.1 weights shared by v2ray using the mlx-lm library. The model is a pre-trained generative Sparse Mixture of Experts (SMoE) with roughly 141 billion total parameters, of which about 39 billion are active for any given token during inference. It has a 65k-token context window and a 32,000-token vocabulary.

Similar models include the Meta-Llama-3-8B-Instruct-4bit and Mixtral-8x22B-v0.1 models, both of which share architectural similarities with the Mixtral-8x22B-4bit.

Model inputs and outputs

Inputs

Text prompts of varying lengths, typically a few sentences or a short paragraph.

Outputs

A continuation of the input text, generating new tokens that extend the prompt in a coherent and contextually relevant manner.

Capabilities

The Mixtral-8x22B-4bit model is capable of generating fluent and contextually appropriate text across a wide range of domains, including creative writing, question answering, summarization, and general language understanding tasks. It can be fine-tuned for specific applications or used as a base model for further customization.

What can I use it for?

The Mixtral-8x22B-4bit model can be a powerful tool for a variety of natural language processing applications, such as:

Content generation: Producing engaging, human-like text for creative writing, journalism, marketing, and other use cases.

Question answering: Responding to user queries with relevant and informative answers.

Summarization: Condensing long-form text into concise, informative summaries.

Dialogue systems: Powering conversational interfaces for chatbots, virtual assistants, and other interactive applications.

Things to try

One interesting aspect of the Mixtral-8x22B-4bit model is its ability to generate diverse and creative text outputs. Try providing the model with open-ended prompts or creative writing exercises and see how it responds. You can also experiment with fine-tuning the model on specific datasets or tasks to adapt it to your particular needs.
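Since this is a pre-trained base model rather than an instruct model, the natural way to use it is plain text continuation. A minimal sketch with the mlx-lm package (assumed installed; note that even at 4-bit the weights require a machine with a very large amount of unified memory):

```python
# Sketch: open-ended text continuation with the Mixtral-8x22B-4bit base model.
# As a non-instruct model it continues raw text rather than following
# chat-formatted instructions.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Mixtral-8x22B-4bit")

prompt = "A sparse mixture-of-experts language model works by"
print(generate(model, tokenizer, prompt=prompt, max_tokens=150))
```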

Read more

Updated 6/17/2024

🌀

phi-2

mlx-community

Total Score

51

The phi-2 model is a Transformer with 2.7 billion parameters, developed by Microsoft Research and converted to the MLX format by the mlx-community. It was trained using the same data sources as the Phi-1.5 model, with an additional new data source consisting of various NLP synthetic texts and filtered websites. When assessed against benchmarks testing common sense, language understanding, and logical reasoning, Phi-2 showcased nearly state-of-the-art performance among models with fewer than 13 billion parameters.

Unlike models fine-tuned through reinforcement learning from human feedback, Phi-2 has not undergone this process. The goal in creating this open-source model is to provide the research community with a non-restricted small model to explore vital safety challenges, such as reducing toxicity, understanding societal biases, and enhancing controllability.

Model Inputs and Outputs

The phi-2 model accepts text inputs and generates text outputs. It is particularly well-suited for prompts using the QA format, the chat format, and the code format.

Inputs

Text: The model can accept various types of text inputs, such as questions, instructions, or prompts.

Outputs

Text: The model generates fluent text responses based on the provided input.

Capabilities

The phi-2 model has demonstrated strong performance on benchmarks testing common sense, language understanding, and logical reasoning. It can be used to generate high-quality text in a variety of formats, including question answering, chatbot conversations, and code generation.

What Can I Use It For?

The phi-2 model is intended for research purposes only. It can be a useful tool for exploring safety challenges in language models, such as reducing toxicity and understanding societal biases. Researchers can use the model to investigate ways to enhance controllability and align large language models with human preferences.

Things to Try

One interesting thing to try with the phi-2 model is to experiment with different input formats and prompts to see how it responds. For example, you could provide the model with a QA-style prompt, a chat-style prompt, and a code generation prompt to compare its performance across different use cases. Another idea is to explore the model's capabilities in generating text that is aligned with human values and preferences, and to investigate ways to further enhance its safety and controllability. The open-source nature of the phi-2 model makes it a valuable resource for the research community to advance the field of safe and responsible AI development.
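Phi-2's model card documents a simple "Instruct: ... / Output:" QA format; here is a minimal sketch with the mlx-lm package (assumed installed, and assuming the mlx-community conversion loads with mlx_lm.load; the prompt and parameters are illustrative):

```python
# Sketch: prompting phi-2 in the QA-style "Instruct:/Output:" format from its
# model card. The model completes the text after "Output:".
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/phi-2")

prompt = "Instruct: Explain why the sky is blue in two sentences.\nOutput:"
print(generate(model, tokenizer, prompt=prompt, max_tokens=120))
```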

Read more

Updated 5/28/2024

🤷

dbrx-instruct-4bit

mlx-community

Total Score

48

The dbrx-instruct-4bit model is a text-to-text AI model converted by the mlx-community from the original databricks/dbrx-instruct model using the mlx-lm tool. The underlying model is a Mixture-of-Experts (MoE) large language model trained by Databricks, and is an instruction-following variant of their base dbrx-base model. Similar mlx-community conversions include Meta-Llama-3-8B-Instruct-4bit and Mixtral-8x22B-4bit; compared to other MoE models such as Mixtral, DBRX uses a fine-grained MoE architecture with a larger number of smaller experts to improve quality.

Model inputs and outputs

The dbrx-instruct-4bit model is a text-to-text model, meaning it takes text-based inputs and produces text-based outputs. It can accept context lengths of up to 32,768 tokens.

Inputs

Text-based prompts and instructions

Outputs

Text-based responses and completions

Capabilities

The dbrx-instruct-4bit model has been fine-tuned on a large, diverse dataset to specialize in few-turn interactions and instruction-following tasks. It demonstrates strong performance on a wide range of language understanding, reasoning, and problem-solving benchmarks.

What can I use it for?

The dbrx-instruct-4bit model is a general-purpose, open-source language model that can be used for a variety of natural language processing tasks. Some potential use cases include:

Building conversational AI assistants that can follow instructions and engage in task-oriented dialogs

Generating human-like text for creative writing, content creation, or dialogue systems

Providing question-answering capabilities for research or educational applications

Aiding in code generation, explanation, and other programming-related tasks

Things to try

One interesting aspect of the dbrx-instruct-4bit model is its fine-grained MoE architecture, which lets it flexibly combine a large number of smaller experts to improve performance. You could experiment with providing the model with diverse prompts and instructions to see how it leverages this capability. Additionally, the model's strong performance on benchmarks like the Databricks Model Gauntlet suggests it may be useful for a wide range of language understanding and reasoning tasks.
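An instruction-following call follows the same pattern as the other chat models in this list; a minimal sketch with the mlx-lm package (assumed installed; it also assumes the converted tokenizer ships a chat template, and that the machine has enough unified memory for the 4-bit weights):

```python
# Sketch: a single-turn instruction to dbrx-instruct-4bit via the tokenizer's
# chat template.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/dbrx-instruct-4bit")

messages = [
    {"role": "user", "content": "Summarize what a mixture-of-experts layer does."}
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(generate(model, tokenizer, prompt=prompt, max_tokens=200))
```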

Read more

Updated 9/6/2024

🤷

c4ai-command-r-plus-4bit

mlx-community

Total Score

46

The c4ai-command-r-plus-4bit model is a 4-bit quantized version of the c4ai-command-r-plus model, a 104 billion parameter generative model developed by Cohere and Cohere For AI. The model has been optimized for a variety of use cases including reasoning, summarization, and question answering. Key capabilities include retrieval-augmented generation (RAG) and multi-step tool use to automate complex tasks.

Model inputs and outputs

Inputs

Text: The model takes text as input only.

Outputs

Text: The model generates text outputs.

Capabilities

The c4ai-command-r-plus-4bit model has advanced tool-use and grounded-generation capabilities. It can use a set of provided tools, such as an internet search, to research and generate responses. It can also produce grounded answers that cite relevant information sources.

What can I use it for?

The c4ai-command-r-plus-4bit model could be used for a variety of natural language processing tasks, such as question answering, summarization, and open-ended dialog. Its tool-use and grounded-generation capabilities make it well-suited for automating complex, multi-step tasks. Potential use cases include virtual assistants, research aids, and knowledge-intensive applications.

Things to try

Some interesting things to explore with the c4ai-command-r-plus-4bit model include experimenting with the different tool-use prompts, testing its multilingual capabilities, and assessing its performance on specialized tasks like code generation or legal analysis. The model's advanced reasoning abilities could also be leveraged for creative applications like story generation or task planning.
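A basic chat call looks like the sketch below, using the mlx-lm package (assumed installed; it also assumes the bundled tokenizer carries Cohere's chat template; the dedicated tool-use and grounded-generation templates, where present on the tokenizer, follow the same pattern with extra arguments):

```python
# Sketch: a basic chat request to c4ai-command-r-plus-4bit.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/c4ai-command-r-plus-4bit")

messages = [
    {"role": "user", "content": "List three research angles on solar panel efficiency."}
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(generate(model, tokenizer, prompt=prompt, max_tokens=250))
```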

Read more

Updated 9/6/2024