Runwayml

Models by this creator


stable-diffusion-v1-5

runwayml

Total Score: 10.8K

stable-diffusion-v1-5 is a latent text-to-image diffusion model developed by runwayml that can generate photo-realistic images from text prompts. It was initialized with the weights of the Stable-Diffusion-v1-2 checkpoint and then fine-tuned for 595k steps at 512x512 resolution on the "laion-aesthetics v2 5+" dataset. During fine-tuning, the text conditioning was dropped 10% of the time to improve classifier-free guidance sampling. Similar models include the Stable-Diffusion-v1-4 checkpoint, which was trained for 225k steps at 512x512 resolution on "laion-aesthetics v2 5+" with the same 10% text-conditioning dropout, and the coreml-stable-diffusion-v1-5 model, a version of stable-diffusion-v1-5 converted for use on Apple Silicon hardware.

Model inputs and outputs

Inputs
- Text prompt: A textual description of the desired image to generate.

Outputs
- Generated image: A photo-realistic image that matches the provided text prompt.

Capabilities

The stable-diffusion-v1-5 model can generate a wide variety of photo-realistic images from text prompts. For example, it can create images of imaginary scenes, such as "a photo of an astronaut riding a horse on mars", as well as more realistic ones, such as "a photo of a yellow cat sitting on a park bench". The model captures details like lighting, textures, and composition, producing convincing and visually appealing outputs.

What can I use it for?

The stable-diffusion-v1-5 model is intended for research purposes only. Potential use cases include:
- Generating artwork and creative content for design, education, or personal projects (using the Diffusers library; see the sketch below)
- Probing the limitations and biases of generative models
- Developing safe deployment strategies for models with the potential to generate harmful content

The model should not be used to create content that is disturbing, offensive, or propagates harmful stereotypes. Excluded uses include generating demeaning representations, impersonating individuals without consent, or sharing copyrighted material.

Things to try

One interesting aspect of the stable-diffusion-v1-5 model is its ability to generate highly detailed and visually compelling images, even for complex or fantastical prompts. Try experimenting with prompts that combine multiple elements, like "a photo of a robot unicorn fighting a giant mushroom in a cyberpunk city". The model's strong grasp of composition and lighting can produce surprisingly coherent and imaginative outputs.

Another area to explore is the model's flexibility with different styles and artistic mediums. Try prompts that reference specific art movements, like "a Monet-style painting of a sunset over a lake" or "a cubist portrait of a person". The model's latent diffusion approach allows it to capture a wide range of visual styles and aesthetics.
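The Diffusers usage mentioned above can be sketched as follows. This is a minimal example, assuming the checkpoint is pulled from the Hugging Face Hub under the runwayml/stable-diffusion-v1-5 model ID and that a CUDA GPU is available; adjust the device, dtype, and output path for your setup.

```python
# Minimal text-to-image sketch with the Diffusers library.
# Assumes the Hub model ID "runwayml/stable-diffusion-v1-5" and a CUDA GPU;
# drop the .to("cuda") call and use float32 weights to run on CPU (much slower).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
# guidance_scale controls the strength of classifier-free guidance, which is
# what the 10% text-conditioning dropout during fine-tuning supports.
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
image.save("astronaut_on_mars.png")
```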


Updated 5/28/2024

stable-diffusion-inpainting

runwayml

Total Score: 1.5K

stable-diffusion-inpainting is a latent text-to-image diffusion model developed by runwayml that generates photo-realistic images from text inputs, with the added capability of inpainting: filling in masked parts of images. Similar models include the stable-diffusion-2-inpainting model from Stability AI, which was resumed from the stable-diffusion-2-base model and trained for inpainting, and the stable-diffusion-xl-1.0-inpainting-0.1 model from the Diffusers team, which was trained for high-resolution inpainting.

Model inputs and outputs

stable-diffusion-inpainting takes a text prompt, an image, and a mask image as inputs. The mask indicates which parts of the original image should be inpainted. The model then generates a new image that combines the original image with content inpainted according to the text prompt (see the code sketch below).

Inputs
- Prompt: A text description of the desired image
- Image: The original image to be inpainted
- Mask Image: A binary mask indicating which parts of the original image should be inpainted (white for inpainting, black for keeping)

Outputs
- Generated Image: The new image with the inpainted content

Capabilities

stable-diffusion-inpainting can fill in missing or corrupted parts of images while maintaining the overall composition and style. For example, you could use it to add a new object to a scene, replace a person in a photo, or repair damaged areas of an image. The model produces highly realistic and cohesive results, leveraging the power of the Stable Diffusion text-to-image generation capabilities.

What can I use it for?

stable-diffusion-inpainting could be useful for a variety of creative and practical applications, such as:
- Restoring old or damaged photos
- Removing unwanted elements from images
- Compositing different visual elements together
- Experimenting with different variations of a scene or composition
- Generating concept art or illustrations for games, films, or other media

The model's ability to maintain the overall aesthetic and coherence of an image while manipulating specific elements makes it a powerful tool for visual creativity and production.

Things to try

One interesting aspect of stable-diffusion-inpainting is its ability to preserve the non-masked parts of the original image while seamlessly blending in new content. This can be used to create surreal or fantastical compositions, such as adding a tiger to a park bench or a spaceship to a landscape. By carefully selecting the mask regions and the prompt, you can explore the boundaries of what the model can achieve in image manipulation and generation.
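The prompt/image/mask workflow described above can be sketched with the Diffusers library as shown below. The Hub model ID runwayml/stable-diffusion-inpainting, the placeholder file names, and the CUDA device are assumptions for illustration; substitute your own image and mask.

```python
# Minimal inpainting sketch with the Diffusers library.
# Assumes the Hub model ID "runwayml/stable-diffusion-inpainting" and a CUDA GPU.
# The image and mask file names are placeholders; the mask is white where
# content should be regenerated and black where original pixels are kept.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

init_image = Image.open("park_bench.png").convert("RGB").resize((512, 512))
mask_image = Image.open("bench_mask.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="a tiger sitting on a park bench",
    image=init_image,
    mask_image=mask_image,
).images[0]
result.save("tiger_on_bench.png")
```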


Updated 5/28/2024