SkunkworksAI

Models by this creator


BakLLaVA-1

SkunkworksAI

Total Score: 370

BakLLaVA-1 is a large language model developed by SkunkworksAI that combines a Mistral 7B base with the LLaVA 1.5 multimodal architecture. It demonstrates that the Mistral 7B base outperforms Llama 2 13B on several benchmarks. This first version of BakLLaVA is fully open source, but it was trained on data that includes the LLaVA corpus, which carries licensing restrictions. An upcoming version, BakLLaVA-2, will use a larger, commercially viable dataset along with a novel architecture.

Model inputs and outputs

BakLLaVA-1 is a vision-language model: it takes an image together with a text prompt and generates a text response. The model was trained on a diverse dataset of over 1 million image-text pairs from sources such as LAION, CC, SBU, and ShareGPT.

Inputs

- An image
- A text prompt or question about the image

Outputs

- Generated text responding to the prompt, grounded in the image content

Capabilities

BakLLaVA-1 performs strongly on multimodal tasks such as visual question answering and image description, outperforming Llama 2 13B on several benchmarks according to the maintainer. The model can answer detailed questions about a wide variety of images.

What can I use it for?

BakLLaVA-1 can be used for visual tasks such as image captioning, visual question answering, and describing product photos. The model's open-source nature and strong performance make it a potentially useful tool for researchers and developers working on visual AI applications.

Things to try

One interesting aspect of BakLLaVA-1 is its use of the LLaVA 1.5 architecture, which pairs a large language model with a vision encoder. This allows the model to leverage both textual and visual information, potentially leading to more coherent and grounded responses about images. Researchers and developers may want to experiment with fine-tuning or adapting the model for their specific use cases to take advantage of these multimodal capabilities.
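Because BakLLaVA-1 follows the LLaVA 1.5 architecture, querying it means pairing an image with a chat-formatted text prompt in which a special token marks where image features are inserted. Below is a minimal sketch of that prompt construction, assuming the standard LLaVA 1.5 "USER/ASSISTANT" chat template; the exact template is a convention of the checkpoint, so verify it against the model card before use.

```python
def build_llava_prompt(question: str) -> str:
    """Build a LLaVA-1.5-style chat prompt for a BakLLaVA-1 query.

    Assumes the common "USER: <image> ... ASSISTANT:" template used by
    LLaVA 1.5 checkpoints; this is a convention, not enforced by the model.
    """
    # The <image> placeholder marks where the vision encoder's image
    # features are spliced into the language model's input sequence.
    return f"USER: <image>\n{question} ASSISTANT:"


prompt = build_llava_prompt("What objects are on the table?")
print(prompt)
```

The resulting string, together with the raw image, would then be handed to the checkpoint's processor (for example via the Hugging Face transformers LLaVA integration) to produce the text response.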


Updated 5/23/2024


phi-2

SkunkworksAI

Total Score: 132

phi-2 is a 2.7 billion parameter language model developed by Microsoft Research. It builds on the previous phi-1.5 model, using the same data sources augmented with new synthetic data and filtered web content. When tested on benchmarks of common sense, language understanding, and logical reasoning, phi-2 demonstrated state-of-the-art performance among models under 13 billion parameters. phi-2 has not been fine-tuned for instruction following or through reinforcement learning from human feedback. Instead, the goal is to provide the research community with an unrestricted small model for exploring safety challenges such as reducing toxicity, understanding biases, and enhancing controllability.

Model inputs and outputs

Inputs

- Text prompts in a variety of formats, including question-answer, chat, and code

Outputs

- Generated text responses to the input prompts

Capabilities

phi-2 performs strongly on language tasks such as question answering, dialogue, and code generation. However, it may produce inaccurate statements or code snippets, so its outputs should be treated as starting points rather than definitive solutions. The model also struggles to follow complex instructions, since it has not been fine-tuned for that purpose.

What can I use it for?

As an open-source research model, phi-2 is intended for exploring model safety and capabilities rather than for direct deployment in production applications. Researchers can use it to study techniques for reducing toxicity, mitigating biases, and improving the controllability of language models. Developers may also find it useful as a building block for prototyping conversational AI features, though they should verify the model's outputs before relying on them.

Things to try

One interesting aspect of phi-2 is its ability to generate code in response to prompts. Developers can experiment with code-related prompts, such as asking the model to write a function that solves a specific problem. They should, however, be mindful of the model's limitations in this area and verify any generated code before using it.
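Because phi-2 is a base model with no instruction tuning, output quality depends heavily on how the prompt is laid out. Below is a small sketch of the question-answer and code-completion prompt layouts described in the model's documentation; the exact templates are conventions the model was trained to expect, not formats it enforces.

```python
def qa_prompt(question: str) -> str:
    # "Instruct:/Output:" is the question-answer layout suggested in
    # phi-2's documentation; treat it as a convention, not a requirement.
    return f"Instruct: {question}\nOutput:"


def code_prompt(signature: str, docstring: str) -> str:
    # For code generation, phi-2 is typically primed with an incomplete
    # function definition (signature plus docstring) that it completes.
    return f'def {signature}:\n    """{docstring}"""\n'


print(qa_prompt("Why is the sky blue?"))
print(code_prompt("print_primes(n)", "Print all primes up to n."))
```

Either string would then be tokenized and passed to the model for generation; keeping the prompt close to these layouts tends to produce more on-format completions from a base model like this.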


Updated 5/28/2024