GraVITON: Graph based garment warping with attention guided inversion for Virtual-tryon

Read original: arXiv:2406.02184 - Published 6/5/2024 by Sanhita Pathak, Vinay Kaushik, Brejesh Lall

Overview

This paper proposes GraVITON, a virtual try-on method that warps garments onto a target person with a graph-based warping module and generates the final image with a latent diffusion model conditioned through attention-guided inversion.
The method leverages a graph representation of the garment to capture its spatial structure and deformation, enabling more realistic and accurate virtual try-on.
The authors report that GraVITON outperforms existing virtual try-on methods on the VITON-HD and Dresscode datasets in garment warping, texture preservation, and overall realism.

Plain English Explanation

GraVITON is a new technique for virtual try-on, which lets you see how clothing would look on you without actually putting it on. The key innovation is a graph-based representation of the garment: the clothing is modeled as a network of connected points rather than just a flat image. This graph structure helps the system track how each part of the garment should move and deform relative to its neighbors, leading to more realistic virtual try-ons.

The system also uses "attention-guided inversion" when generating the final try-on image. According to the paper's abstract, a diffusion model fills in the garment region of the person image and is guided by both visual and textual information through cross-attention. This lets the model draw on the most relevant garment details rather than weighting everything equally, which helps the result look more accurate and natural.

Compared to previous virtual try-on methods, GraVITON produces try-ons that look and feel more realistic and true-to-life. This could be very helpful for online shopping, where being able to visualize how clothes will fit is a major challenge. By making virtual try-ons more convincing, GraVITON could improve the shopping experience and reduce returns.

Technical Explanation

The key innovation in GraVITON is its graph-based garment warping module. The garment is modeled as a graph, with nodes representing key points on the fabric and edges representing the connections between them. This structure lets GraVITON reason about spatial relationships and deformations using context from neighboring regions, which, according to the abstract, earlier warping methods based on thin-plate splines (TPS) and flow fields tend to overlook. The authors also introduce an occlusion-aware warping constraint intended to produce a dense warped garment without holes.
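
As a rough illustration of the node-and-edge idea described above, the following Python sketch builds a small grid graph over a garment and runs one round of neighbour averaging on per-node displacement offsets. The grid layout, the function names `build_grid_graph` and `message_pass`, and the `self_weight` parameter are assumptions made for this example; they are not taken from the paper, whose graph neural network is learned rather than hand-written.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def build_grid_graph(rows: int, cols: int) -> Tuple[List[Point], Dict[int, List[int]]]:
    """Return node positions and a 4-neighbour adjacency list for a rows x cols grid."""
    positions = [(float(c), float(r)) for r in range(rows) for c in range(cols)]
    neighbours: Dict[int, List[int]] = {i: [] for i in range(rows * cols)}
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:  # edge to the node on the right
                neighbours[i].append(i + 1)
                neighbours[i + 1].append(i)
            if r + 1 < rows:  # edge to the node below
                neighbours[i].append(i + cols)
                neighbours[i + cols].append(i)
    return positions, neighbours


def message_pass(
    offsets: List[Point],
    neighbours: Dict[int, List[int]],
    self_weight: float = 0.5,
) -> List[Point]:
    """One round of mean aggregation: blend each node's offset with its neighbours' mean."""
    updated: List[Point] = []
    for i, (dx, dy) in enumerate(offsets):
        nbrs = neighbours[i]
        if not nbrs:  # isolated node keeps its own offset
            updated.append((dx, dy))
            continue
        mean_dx = sum(offsets[j][0] for j in nbrs) / len(nbrs)
        mean_dy = sum(offsets[j][1] for j in nbrs) / len(nbrs)
        updated.append((
            self_weight * dx + (1.0 - self_weight) * mean_dx,
            self_weight * dy + (1.0 - self_weight) * mean_dy,
        ))
    return updated


if __name__ == "__main__":
    positions, neighbours = build_grid_graph(2, 2)
    # Only node 0 is displaced at first; its neighbours pick up part of the motion.
    offsets = [(4.0, 0.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
    print(message_pass(offsets, neighbours))
```

In this toy setup, a displacement applied at one node spreads to its neighbours, which is one simple way a graph can propagate context across a garment.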

The warped garment and a coarse person image produced by the warping module are passed to a simple refinement network, which outputs a coarse try-on image. The final try-on is then generated by a latent diffusion model that treats garment transfer as an inpainting task. The diffusion model is conditioned through what the authors call decoupled cross-attention-based inversion of visual and textual information, so attention determines which garment and text features influence each region of the generated image.
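
The attention mechanism mentioned above can be illustrated with generic scaled dot-product cross-attention, where each query (for example, a region of the person image) computes a weighted average over a set of values (for example, garment features). This is a minimal sketch of the standard formulation, not the paper's specific module; the function names and the toy two-dimensional features are assumptions for illustration.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(
    queries: List[Vector],
    keys: List[Vector],
    values: List[Vector],
) -> Tuple[List[Vector], List[List[float]]]:
    """Return attended outputs and attention weights for each query."""
    dim = len(keys[0])
    outputs: List[Vector] = []
    all_weights: List[List[float]] = []
    for q in queries:
        # Similarity between this query and every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


if __name__ == "__main__":
    # One query that closely matches the first of two keys.
    outs, weights = cross_attention(
        queries=[[10.0, 0.0]],
        keys=[[1.0, 0.0], [0.0, 1.0]],
        values=[[1.0, 0.0], [0.0, 1.0]],
    )
    print(weights[0], outs[0])
```

The weights for each query sum to one, so a query that matches one key strongly pulls its output almost entirely from the corresponding value.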

The graph-based warping and the attention-guided inversion are implemented with deep learning models, including a graph neural network and a latent diffusion architecture. The authors validate the approach on the VITON-HD and Dresscode datasets and report state-of-the-art qualitative and quantitative results, with improvements in garment warping, texture preservation, and overall realism.
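
The paper's abstract describes the final stage as treating garment transfer as an inpainting task. A minimal sketch of the generic inpainting blend, in which a mask decides which pixels are regenerated and which are kept from the person image, might look like this. The function name `composite_inpainting`, the nested-list image format, and the soft mask values are assumptions for illustration; an actual latent diffusion model typically operates in a learned latent space over many denoising steps, which is not shown here.

```python
from typing import List

Image = List[List[float]]  # single-channel image as rows of pixel intensities


def composite_inpainting(original: Image, generated: Image, mask: Image) -> Image:
    """Use generated pixels where mask is 1, original pixels where mask is 0.

    Mask values between 0 and 1 blend the two images linearly.
    """
    if not (len(original) == len(generated) == len(mask)):
        raise ValueError("original, generated and mask must have the same height")
    result: Image = []
    for o_row, g_row, m_row in zip(original, generated, mask):
        if not (len(o_row) == len(g_row) == len(m_row)):
            raise ValueError("original, generated and mask must have the same width")
        result.append([m * g + (1.0 - m) * o for o, g, m in zip(o_row, g_row, m_row)])
    return result


if __name__ == "__main__":
    person = [[0.2, 0.2], [0.2, 0.2]]     # pixels to preserve outside the mask
    garment = [[0.9, 0.9], [0.9, 0.9]]    # stand-in for generated content
    region = [[0.0, 1.0], [0.5, 0.0]]     # hypothetical garment mask
    print(composite_inpainting(person, garment, region))
```

Keeping unmasked pixels untouched is what lets an inpainting formulation preserve the person's face, hair, and background while only the garment region changes.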

Critical Analysis

The GraVITON paper presents a compelling approach to virtual try-on that addresses some key limitations of prior methods. The graph-based garment representation and the attention-guided inversion are innovative and seem to yield meaningful improvements in realism and accuracy.

That said, the paper does not provide a detailed analysis of the computational complexity or inference time of the GraVITON system. This is an important practical consideration, as virtual try-on needs to be responsive and efficient to provide a good user experience. The authors also do not discuss potential failure cases or limitations of their approach, such as how it might handle highly complex or deformable garments.

Additionally, although the authors report both qualitative and quantitative improvements on VITON-HD and Dresscode, a closer look at the individual metrics would help clarify the magnitude of the gains over previous methods. The authors could also explore how well their approach generalizes to a wider range of garment types and body shapes.

Overall, GraVITON represents an interesting advance in virtual try-on technology, but there are still some open questions and areas for further research and refinement.

Conclusion

The GraVITON system presents a novel graph-based approach to virtual try-on that pairs context-aware garment warping with a latent diffusion model conditioned through attention-guided inversion. The authors report that, by modeling the garment as a graph and treating the final garment transfer as an inpainting task, GraVITON outperforms previous state-of-the-art methods in garment warping, texture preservation, and overall realism.

This work has promising implications for improving the online shopping experience, where virtual try-on is a crucial but challenging feature. By making virtual try-ons more convincing and true-to-life, GraVITON could help reduce the number of returns and increase customer satisfaction. Further research into the computational efficiency, generalization, and potential limitations of the approach could help refine and expand the capabilities of this innovative virtual try-on system.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

GraVITON: Graph based garment warping with attention guided inversion for Virtual-tryon

Sanhita Pathak, Vinay Kaushik, Brejesh Lall

Virtual try-on, a rapidly evolving field in computer vision, is transforming e-commerce by improving customer experiences through precise garment warping and seamless integration onto the human body. While existing methods such as TPS and flow address garment warping, they overlook the finer contextual details. In this paper, we introduce a novel graph based warping technique which emphasizes the value of context in garment flow. Our graph based warping module generates warped garment as well as a coarse person image, which is utilised by a simple refinement network to give a coarse virtual tryon image. The proposed work exploits latent diffusion model to generate the final tryon, treating garment transfer as an inpainting task. The diffusion model is conditioned with decoupled cross attention based inversion of visual and textual information. We introduce an occlusion aware warping constraint that generates dense warped garment, without any holes and occlusion. Our method, validated on VITON-HD and Dresscode datasets, showcases substantial state-of-the-art qualitative and quantitative results showing considerable improvement in garment warping, texture preservation, and overall realism.

6/5/2024

Single Stage Warped Cloth Learning and Semantic-Contextual Attention Feature Fusion for Virtual TryOn

Sanhita Pathak, Vinay Kaushik, Brejesh Lall

Image-based virtual try-on aims to fit an in-shop garment onto a clothed person image. Garment warping, which aligns the target garment with the corresponding body parts in the person image, is a crucial step in achieving this goal. Existing methods often use multi-stage frameworks to handle clothes warping, person body synthesis and tryon generation separately or rely on noisy intermediate parser-based labels. We propose a novel single-stage framework that implicitly learns the same without explicit multi-stage learning. Our approach utilizes a novel semantic-contextual fusion attention module for garment-person feature fusion, enabling efficient and realistic cloth warping and body synthesis from target pose keypoints. By introducing a lightweight linear attention framework that attends to garment regions and fuses multiple sampled flow fields, we also address misalignment and artifacts present in previous methods. To achieve simultaneous learning of warped garment and try-on results, we introduce a Warped Cloth Learning Module. Our proposed approach significantly improves the quality and efficiency of virtual try-on methods, providing users with a more reliable and realistic virtual try-on experience.

5/28/2024

A Novel Garment Transfer Method Supervised by Distilled Knowledge of Virtual Try-on Model

Naiyu Fang, Lemiao Qiu, Shuyou Zhang, Zili Wang, Kerui Hu, Jianrong Tan

This paper proposes a novel garment transfer method supervised with knowledge distillation from virtual try-on. Our method first reasons the transfer parsing to provide shape prior to downstream tasks. We employ a multi-phase teaching strategy to supervise the training of the transfer parsing reasoning model, learning the response and feature knowledge from the try-on parsing reasoning model. To correct the teaching error, it transfers the garment back to its owner to absorb the hard knowledge in the self-study phase. Guided by the transfer parsing, we adjust the position of the transferred garment via STN to prevent distortion. Afterward, we estimate a progressive flow to precisely warp the garment with shape and content correspondences. To ensure warping rationality, we supervise the training of the garment warping model using target shape and warping knowledge from virtual try-on. To better preserve body features in the transfer result, we propose a well-designed training strategy for the arm regrowth task to infer new exposure skin. Experiments demonstrate that our method has state-of-the-art performance compared with other virtual try-on and garment transfer methods in garment transfer, especially for preserving garment texture and body features.

4/5/2024

VTON-IT: Virtual Try-On using Image Translation

Santosh Adhikari, Bishnu Bhusal, Prashant Ghimire, Anil Shrestha

Virtual Try-On (trying clothes virtually) is a promising application of the Generative Adversarial Network (GAN). However, it is an arduous task to transfer the desired clothing item onto the corresponding regions of a human body because of varying body size, pose, and occlusions like hair and overlapped clothes. In this paper, we try to produce photo-realistic translated images through semantic segmentation and a generative adversarial architecture-based image translation network. We present a novel image-based Virtual Try-On application VTON-IT that takes an RGB image, segments desired body part, and overlays target cloth over the segmented body region. Most state-of-the-art GAN-based Virtual Try-On applications produce unaligned pixelated synthesis images on real-life test images. However, our approach generates high-resolution natural images with detailed textures on such variant images.

5/8/2024