IIDM: Inter and Intra-domain Mixing for Semi-supervised Domain Adaptation in Semantic Segmentation

arXiv: 2308.15855

Published 4/12/2024 by Weifu Fu, Qiang Nie, Jialin Li, Yuhuan Lin, Kai Wu, Jian Li, Yabiao Wang, Yong Liu, Chengjie Wang

cs.CV

Abstract

Despite recent advances in semantic segmentation, an inevitable challenge is the performance degradation caused by the domain shift in real applications. The current dominant approach to solve this problem is unsupervised domain adaptation (UDA). However, the absence of labeled target data in UDA is overly restrictive and limits performance. To overcome this limitation, a more practical scenario called semi-supervised domain adaptation (SSDA) has been proposed. Existing SSDA methods are derived from the UDA paradigm and primarily focus on leveraging the unlabeled target data and source data. In this paper, we highlight the significance of exploiting the intra-domain information between the labeled target data and unlabeled target data. Instead of solely using the scarce labeled target data for supervision, we propose a novel SSDA framework that incorporates both Inter and Intra Domain Mixing (IIDM), where inter-domain mixing mitigates the source-target domain gap and intra-domain mixing enriches the available target domain information, and the network can capture more domain-invariant features. We also explore different domain mixing strategies to better exploit the target domain information. Comprehensive experiments conducted on the GTA5 to Cityscapes and SYNTHIA to Cityscapes benchmarks demonstrate the effectiveness of IIDM, surpassing previous methods by a large margin.

Overview

Despite recent advances in semantic segmentation, model performance still degrades in real-world scenarios because of the domain shift problem.
The current dominant approach to solve this issue is unsupervised domain adaptation (UDA), but the absence of labeled target data limits its performance.
To address this limitation, a more practical scenario called semi-supervised domain adaptation (SSDA) has been proposed.
Existing SSDA methods focus on leveraging the unlabeled target data and source data, but they overlook the significance of exploiting the intra-domain information between the labeled and unlabeled target data.

Plain English Explanation

Semantic segmentation is a computer vision technique that assigns a label to each pixel in an image, identifying the objects and regions present. While this technology has advanced significantly, a key challenge is that its performance can degrade when applied to real-world scenarios, where the data may be different from the training data.
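
As a rough illustration of what "a label for each pixel" means, the sketch below turns per-pixel class scores into a label map by picking the highest-scoring class at every pixel. The class names, the toy scores, and the `segment` helper are hypothetical and chosen only for this example; a real model would produce the scores from an input image.

```python
from typing import List

# Hypothetical class names, used only for this illustration.
CLASS_NAMES = ["road", "car", "sky"]


def segment(scores: List[List[List[float]]]) -> List[List[int]]:
    """Turn per-pixel class scores (H x W x C) into a label map (H x W)."""
    return [
        # For each pixel, keep the index of the class with the highest score.
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


# A toy 2x2 "image": each pixel holds one score per class.
toy_scores = [
    [[0.1, 0.2, 0.7], [0.2, 0.1, 0.7]],
    [[0.8, 0.1, 0.1], [0.3, 0.6, 0.1]],
]

label_map = segment(toy_scores)
named = [[CLASS_NAMES[c] for c in row] for row in label_map]
print(named)
```

Domain shift shows up at this step: when the input images look different from the training images, the scores become less reliable and more pixels receive the wrong label.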

To address this "domain shift" problem, researchers have developed unsupervised domain adaptation (UDA) techniques, which aim to adapt the model to the new domain without requiring labeled data from that domain. However, the lack of labeled target data in UDA can still limit the model's performance.

To overcome this, a setting called semi-supervised domain adaptation (SSDA) has been proposed, which leverages a small amount of labeled target data along with the unlabeled target data and source data. Existing SSDA methods have focused on the unlabeled target data and source data, and use the scarce labeled target data only for direct supervision. They overlook how the labeled and unlabeled target data can be combined to better capture the target domain.

Technical Explanation

This paper introduces a novel SSDA framework called Inter and Intra Domain Mixing (IIDM), which aims to exploit both the inter-domain information (between source and target domains) and the intra-domain information (between labeled and unlabeled target data) to improve the model's performance.

The key idea is to use a mixing strategy that combines the source data, labeled target data, and unlabeled target data in a way that helps the model capture more domain-invariant features. The inter-domain mixing component helps bridge the gap between the source and target domains, while the intra-domain mixing component enriches the available information about the target domain.
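
The two mixing components can be sketched with a simple mask-based paste. This is a minimal sketch under stated assumptions, not the authors' implementation: the summary does not specify how images are mixed, so the binary-mask paste (in the spirit of CutMix or ClassMix), the use of pseudo-labels for the unlabeled target image, and every name below (`mix`, `mix_pair`, `constant`, the toy values) are assumptions made for illustration.

```python
from typing import List, Tuple

Grid = List[List[int]]  # H x W grid of pixel values or class ids


def mix(mask: Grid, a: Grid, b: Grid) -> Grid:
    """Take pixels from `a` where mask is 1 and from `b` elsewhere."""
    return [
        [pa if m == 1 else pb for m, pa, pb in zip(mask_row, row_a, row_b)]
        for mask_row, row_a, row_b in zip(mask, a, b)
    ]


def mix_pair(mask: Grid, img_a: Grid, lbl_a: Grid,
             img_b: Grid, lbl_b: Grid) -> Tuple[Grid, Grid]:
    """Apply the same mask to images and labels so they stay aligned."""
    return mix(mask, img_a, img_b), mix(mask, lbl_a, lbl_b)


def constant(value: int, h: int = 2, w: int = 3) -> Grid:
    """Build a toy H x W grid filled with one value."""
    return [[value] * w for _ in range(h)]


# Toy single-channel data (assumed values, for illustration only).
src_img, src_lbl = constant(10), constant(0)         # labeled source
tgt_l_img, tgt_l_lbl = constant(20), constant(1)     # labeled target
tgt_u_img, tgt_u_pseudo = constant(30), constant(2)  # unlabeled target with
                                                     # assumed pseudo-labels

mask = [[1, 0, 0], [1, 1, 0]]  # 1 = paste from the first image

# Inter-domain mixing: source pixels pasted onto an unlabeled target image.
inter_img, inter_lbl = mix_pair(mask, src_img, src_lbl, tgt_u_img, tgt_u_pseudo)

# Intra-domain mixing: labeled target pixels pasted onto an unlabeled target image.
intra_img, intra_lbl = mix_pair(mask, tgt_l_img, tgt_l_lbl, tgt_u_img, tgt_u_pseudo)

print(inter_img, inter_lbl)
print(intra_img, intra_lbl)
```

In this sketch the only difference between the two components is where the pasted pixels come from: the source domain for inter-domain mixing, and the labeled part of the target domain for intra-domain mixing. Reusing one mask for the image and its labels keeps every pixel paired with the correct class.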

The authors explore different domain mixing strategies and conduct comprehensive experiments on the GTA5 to Cityscapes and SYNTHIA to Cityscapes benchmarks. The results show that IIDM significantly outperforms previous SSDA methods, demonstrating the effectiveness of their approach.

Critical Analysis

The paper presents a promising solution to the domain shift problem in semantic segmentation by leveraging both the inter-domain and intra-domain information in a semi-supervised setting. However, there are a few potential limitations and areas for further research:

The paper focuses on semantic segmentation, but the IIDM framework could potentially be applied to other computer vision tasks that suffer from domain shift, such as instance-aware domain adaptation or unsupervised domain adaptation for remote sensing. Exploring how well IIDM generalizes to these other tasks could be an interesting direction.
The paper does not provide an in-depth analysis of the individual contributions of the inter-domain and intra-domain mixing components. Understanding the relative importance of these two aspects could help inform the design of more effective SSDA algorithms in the future.
The experiments are conducted on well-established benchmarks, but it would be valuable to see the performance of IIDM on more diverse and challenging real-world datasets to further validate its robustness.
The paper does not discuss the computational complexity or training time of the IIDM framework compared to other SSDA methods. This information could be useful for practitioners when selecting the appropriate technique for their application.

Overall, the gains reported on both benchmarks make the IIDM framework a strong foundation for future research on semi-supervised domain adaptation for semantic segmentation.

Conclusion

This paper introduces a novel semi-supervised domain adaptation (SSDA) framework called Inter and Intra Domain Mixing (IIDM) to address the domain shift problem in semantic segmentation. By leveraging both the inter-domain information between the source and target domains and the intra-domain information between the labeled and unlabeled target data, IIDM outperforms previous SSDA methods on the reported benchmarks.

The key contribution of this work is recognizing the importance of the intra-domain information between the labeled and unlabeled target data, rather than using the scarce labeled target data solely for supervision as existing SSDA approaches do. Comprehensive experiments on the GTA5 to Cityscapes and SYNTHIA to Cityscapes benchmarks support the effectiveness of the IIDM framework and suggest a direction for further work on domain adaptation in computer vision.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Constructing and Exploring Intermediate Domains in Mixed Domain Semi-supervised Medical Image Segmentation

Qinghe Ma, Jian Zhang, Lei Qi, Qian Yu, Yinghuan Shi, Yang Gao

Both limited annotation and domain shift are prevalent challenges in medical image segmentation. Traditional semi-supervised segmentation and unsupervised domain adaptation methods address one of these issues separately. However, the coexistence of limited annotation and domain shift is quite common, which motivates us to introduce a novel and challenging scenario: Mixed Domain Semi-supervised medical image Segmentation (MiDSS). In this scenario, we handle data from multiple medical centers, with limited annotations available for a single domain and a large amount of unlabeled data from multiple domains. We found that the key to solving the problem lies in how to generate reliable pseudo labels for the unlabeled data in the presence of domain shift with labeled data. To tackle this issue, we employ Unified Copy-Paste (UCP) between images to construct intermediate domains, facilitating the knowledge transfer from the domain of labeled data to the domains of unlabeled data. To fully utilize the information within the intermediate domain, we propose a symmetric Guidance training strategy (SymGD), which additionally offers direct guidance to unlabeled data by merging pseudo labels from intermediate samples. Subsequently, we introduce a Training Process aware Random Amplitude MixUp (TP-RAM) to progressively incorporate style-transition components into intermediate samples. Compared with existing state-of-the-art approaches, our method achieves a notable 13.57% improvement in Dice score on Prostate dataset, as demonstrated on three public datasets. Our code is available at https://github.com/MQinghe/MiDSS .

4/16/2024

cs.CV cs.LG

Language-Guided Instance-Aware Domain-Adaptive Panoptic Segmentation

Elham Amin Mansour, Ozan Unal, Suman Saha, Benjamin Bejar, Luc Van Gool

The increasing relevance of panoptic segmentation is tied to the advancements in autonomous driving and AR/VR applications. However, the deployment of such models has been limited due to the expensive nature of dense data annotation, giving rise to unsupervised domain adaptation (UDA). A key challenge in panoptic UDA is reducing the domain gap between a labeled source and an unlabeled target domain while harmonizing the subtasks of semantic and instance segmentation to limit catastrophic interference. While considerable progress has been achieved, existing approaches mainly focus on the adaptation of semantic segmentation. In this work, we focus on incorporating instance-level adaptation via a novel instance-aware cross-domain mixing strategy IMix. IMix significantly enhances the panoptic quality by improving instance segmentation performance. Specifically, we propose inserting high-confidence predicted instances from the target domain onto source images, retaining the exhaustiveness of the resulting pseudo-labels while reducing the injected confirmation bias. Nevertheless, such an enhancement comes at the cost of degraded semantic performance, attributed to catastrophic forgetting. To mitigate this issue, we regularize our semantic branch by employing CLIP-based domain alignment (CDA), exploiting the domain-robustness of natural language prompts. Finally, we present an end-to-end model incorporating these two mechanisms called LIDAPS, achieving state-of-the-art results on all popular panoptic UDA benchmarks.

4/8/2024

cs.CV cs.AI

Multi-Target Unsupervised Domain Adaptation for Semantic Segmentation without External Data

Yonghao Xu, Pedram Ghamisi, Yannis Avrithis

Multi-target unsupervised domain adaptation (UDA) aims to learn a unified model to address the domain shift between multiple target domains. Due to the difficulty of obtaining annotations for dense predictions, it has recently been introduced into cross-domain semantic segmentation. However, most existing solutions require labeled data from the source domain and unlabeled data from multiple target domains concurrently during training. Collectively, we refer to this data as external. When faced with new unlabeled data from an unseen target domain, these solutions either do not generalize well or require retraining from scratch on all data. To address these challenges, we introduce a new strategy called multi-target UDA without external data for semantic segmentation. Specifically, the segmentation model is initially trained on the external data. Then, it is adapted to a new unseen target domain without accessing any external data. This approach is thus more scalable than existing solutions and remains applicable when external data is inaccessible. We demonstrate this strategy using a simple method that incorporates self-distillation and adversarial learning, where knowledge acquired from the external data is preserved during adaptation through one-way adversarial learning. Extensive experiments in several synthetic-to-real and real-to-real adaptation settings on four benchmark urban driving datasets show that our method significantly outperforms current state-of-the-art solutions, even in the absence of external data. Our source code is available online (https://github.com/YonghaoXu/UT-KD).

5/13/2024

cs.CV

Style Adaptation for Domain-adaptive Semantic Segmentation

Ting Li, Jianshu Chao, Deyu An

Unsupervised Domain Adaptation (UDA) refers to the method that utilizes annotated source domain data and unlabeled target domain data to train a model capable of generalizing to the target domain data. Domain discrepancy leads to a significant decrease in the performance of general network models trained on the source domain data when applied to the target domain. We introduce a straightforward approach to mitigate the domain discrepancy, which necessitates no additional parameter calculations and seamlessly integrates with self-training-based UDA methods. Through the transfer of the target domain style to the source domain in the latent feature space, the model is trained to prioritize the target domain style during the decision-making process. We tackle the problem at both the image-level and shallow feature map level by transferring the style information from the target domain to the source domain data. As a result, we obtain a model that exhibits superior performance on the target domain. Our method yields remarkable enhancements in the state-of-the-art performance for synthetic-to-real UDA tasks. For example, our proposed method attains a noteworthy UDA performance of 76.93 mIoU on the GTA->Cityscapes dataset, representing a notable improvement of +1.03 percentage points over the previous state-of-the-art results.

4/26/2024

cs.CV