Fuyu-8B is a multi-modal text and image transformer model developed by Adept AI. Its architecture is simpler than that of most multi-modal models: it is a decoder-only transformer with no separate image encoder. Image patches are linearly projected directly into the first transformer layer, bypassing the embedding lookup. This allows the model to handle arbitrary image resolutions without separate high- and low-resolution training stages. The model is optimized for digital agents, supporting tasks like answering questions about graphs and diagrams, UI-based questions, and fine-grained localization on screen images.

Model inputs and outputs

Inputs

- **Text**: The model accepts text inputs.
- **Images**: The model accepts images of arbitrary size and treats the resulting sequence of image tokens the same way as text tokens.

Outputs

- **Text**: The model generates text in response to the provided text and image inputs.

Capabilities

Fuyu-8B can reason over text and images together, which enables tasks such as visual question answering and image captioning. The released checkpoint is a base model, so use cases such as verbose captioning or multimodal chat generally call for fine-tuning. Adept reports responses for large images in under 100 milliseconds, which makes the model a candidate for latency-sensitive applications.

What can I use it for?

Fuyu-8B can be applied to a variety of tasks, such as:

- **Digital Assistants**: The model's multi-modal capabilities and its focus on digital agents make it a good fit for assistants that need to understand both text and image inputs, such as screenshots of a user interface.
- **Content Creation**: The model can draft text such as captions, descriptions, and marketing copy grounded in an input image. Its output is text only; it does not generate images.
- **Visual Question Answering**: The model can be used to build applications that answer questions about images, diagrams, and other visual content.

Things to try

Because Fuyu-8B handles arbitrary image resolutions, you can feed it the same image at different sizes and compare the responses. You can also fine-tune the model on a specific dataset or task and measure how its performance changes.
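
The patch-projection idea described above can be illustrated with a small, dependency-free sketch. It is not Adept's implementation: the patch size, the embedding width, the zero-padding of image edges, and the `<image-newline>` row marker name are all assumptions chosen to keep the example tiny. It only shows how an image of any resolution can become a variable-length sequence of projected patches without an embedding lookup.

```python
import random

PATCH = 2                      # hypothetical patch edge, kept tiny for illustration
D_MODEL = 4                    # hypothetical embedding width
NEWLINE = "<image-newline>"    # placeholder marker for the end of a patch row


def patchify(image, patch=PATCH):
    """Split an H x W grayscale image (a list of rows) into flattened patches.

    Returns a list of patch rows. Edges are zero-padded (an assumption) so
    that any resolution can be processed.
    """
    h, w = len(image), len(image[0])
    rows = []
    for top in range(0, h, patch):
        row = []
        for left in range(0, w, patch):
            flat = [
                image[y][x] if y < h and x < w else 0.0
                for y in range(top, top + patch)
                for x in range(left, left + patch)
            ]
            row.append(flat)
        rows.append(row)
    return rows


def project(flat_patch, weight):
    """Linear projection: a matrix-vector product instead of an embedding lookup."""
    return [sum(w * p for w, p in zip(w_row, flat_patch)) for w_row in weight]


def image_to_sequence(image, weight, patch=PATCH):
    """Turn an image into a sequence of projected patches with row markers."""
    seq = []
    for row in patchify(image, patch):
        seq.extend(project(p, weight) for p in row)
        seq.append(NEWLINE)    # marks where one row of patches ends
    return seq


rng = random.Random(0)
# Stand-in for a learned projection matrix of shape D_MODEL x (PATCH * PATCH).
WEIGHT = [[rng.uniform(-1, 1) for _ in range(PATCH * PATCH)] for _ in range(D_MODEL)]

small = [[1.0] * 4 for _ in range(2)]   # 2 x 4 image
large = [[1.0] * 5 for _ in range(3)]   # 3 x 5 image, needs padding
print(len(image_to_sequence(small, WEIGHT)))   # 3: two patches plus one row marker
print(len(image_to_sequence(large, WEIGHT)))   # 8: two rows of three patches, each with a marker
```

The sequence length grows with the image size, which is why no fixed resolution has to be chosen in advance in this kind of design.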

The persimmon-8b-chat model is an AI language model developed by Adept, a company working towards an AI agent that can help people with a wide range of computer-based tasks. This 8 billion parameter model was trained from scratch with a context size of 16,000 tokens, which is four times that of LLaMA2 (about 4,000 tokens) and eight times that of GPT-3 and MPT (about 2,000 tokens). The model has been fine-tuned for chat completion, making it well-suited for conversational tasks.

Compared to similar models like llama-2-7b-chat from Meta, persimmon-8b-chat has a significantly larger context size, which can help it stay coherent in longer conversations. Meta's meta-llama-3-70b-instruct and meta-llama-3-8b-instruct models also target conversational tasks, and the 70 billion parameter version is considerably larger in scale.

Model inputs and outputs

The persimmon-8b-chat model takes natural language input in the form of queries or prompts and generates responses in natural language. It is designed for open-ended chat and can maintain context and coherence across multiple turns of conversation.

Inputs

- Natural language queries or prompts

Outputs

- Natural language responses, generated based on the input and the conversation context

Capabilities

The persimmon-8b-chat model can hold coherent, context-aware conversations. It can respond to a wide range of queries, ask and answer questions, and adopt an empathetic or personable tone where the situation calls for it.

What can I use it for?

The persimmon-8b-chat model could be useful for applications that require natural language interaction, such as customer service chatbots, virtual assistants, or educational tools. Its large context size makes it well-suited for tasks that require continuity over many turns of dialogue.

Things to try

Because of its large context size, persimmon-8b-chat can carry on long-form, contextual conversations. You could prompt the model to sustain a single narrative or discussion over many turns and check how well it keeps track of earlier details.
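
When experimenting with long conversations, it can help to track how much of the context window the history consumes. The sketch below is one possible way to do that, not part of any Adept tooling: `count_tokens` is a crude whitespace-based stand-in for the model's real tokenizer, and `REPLY_RESERVE` is a hypothetical amount of headroom left for the reply. It keeps the most recent turns that fit within the 16,000-token figure quoted above.

```python
from collections import deque

CONTEXT_TOKENS = 16_000   # context size quoted in the text above
REPLY_RESERVE = 1_000     # hypothetical headroom left for the model's reply


def count_tokens(text):
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def fit_history(turns, budget=CONTEXT_TOKENS - REPLY_RESERVE):
    """Keep the most recent turns whose combined token count fits the budget.

    `turns` is a list of (speaker, text) pairs, oldest first.
    Returns the kept turns (oldest first) and the number of tokens they use.
    """
    kept = deque()
    used = 0
    for speaker, text in reversed(turns):      # walk from newest to oldest
        cost = count_tokens(text)
        if used + cost > budget:
            break                              # this turn and all older ones are dropped
        kept.appendleft((speaker, text))
        used += cost
    return list(kept), used


history = [
    ("user", "word " * 9_000),
    ("assistant", "word " * 5_000),
    ("user", "word " * 4_000),
]
kept, used = fit_history(history)
print(len(kept), used)   # 2 9000: the oldest turn no longer fits
```

With a smaller context window the same history would be cut much earlier, which is the practical benefit of the larger context described above. A real deployment would replace `count_tokens` with the model's own tokenizer, since word counts only approximate token counts.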

