Heterogeneous Network Based Contrastive Learning Method for PolSAR Land Cover Classification

Read original: arXiv:2403.19902 - Published 5/7/2024 by Jianfeng Cai, Yue Ma, Zhixi Feng, Shuyuan Yang

Introduction

The paper introduces polarimetric synthetic aperture radar (PolSAR), an active remote sensing technology that captures more information than conventional single-polarization synthetic aperture radar (SAR). PolSAR does this by transmitting and receiving waves in different polarization combinations and measuring the backscatter from land covers. As an active microwave sensor, it can observe targets in all weather conditions and at any time of day.

PolSAR image classification, which the paper describes as the most crucial task in PolSAR image interpretation, is widely used in fields such as geography, agriculture, and environmental monitoring.

Figure 1: Visual comparison of instance similarity between PolSAR and optical images, with PolSAR images on the left and optical images on the right.

The paper then discusses the limitations of existing PolSAR classification methods and proposes a new approach, the Heterogeneous Network based Contrastive Learning method (HCLNet). The key points are:

Existing PolSAR classification methods based on hand-crafted features or deep learning either fail to fully represent the data or require large labeled datasets.
Self-supervised contrastive learning (CL) methods, popular for optical images, face challenges when applied to PolSAR due to differences in data characteristics.
The proposed HCLNet aims to overcome these challenges by effectively utilizing multi-features in PolSAR, addressing pixel similarity issues, and handling scattering confusion problems.
HCLNet introduces a heterogeneous network with an online 2D CNN and a target 1D CNN to learn representations from different PolSAR features without labeled data.
It employs a novel superpixel-based instance discrimination task to reduce pixel similarity and facilitate better representation learning.
A feature filter selects complementary features and reduces redundancy among multi-features.
Experimental results on benchmark PolSAR datasets demonstrate HCLNet's state-of-the-art performance in few-shot and full-sample scenarios.

Related Work

The section discusses contrastive learning (CL) for self-supervised representation learning in polarimetric synthetic aperture radar (PolSAR) images. Key points:

CL is a prevalent self-supervised learning method. It uses two networks - an online network to be fine-tuned for downstream tasks, and a target network to aid learning. The goal is to make representations from the same image similar, and different images dissimilar.
The section categorizes CL approaches for optical images based on parameter updating and negative sample selection methods. Some examples are Instance Discrimination, CPC, CMC, MoCo, SimCLR, BYOL, and SimSiam.
For PolSAR images, methods like MI-SSL, PCLNet, SSPRL, and TCSPANet tailor CL by exploiting polarimetric properties.
The paper proposes a heterogeneous CL network using superpixel instance discrimination for negative sample selection.
It discusses various physical and statistical features extractable from PolSAR data like target decompositions, coherency and covariance matrices.
Proper feature selection is important, and the paper proposes a "feature filter" method to obtain a good complementary feature combination.

The section provides an overview of CL methods for representation learning, their adaptations for PolSAR data, and the use of polarimetric features, setting up the proposed approach.
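
As an illustration of the coherency matrix mentioned above, the following minimal sketch builds it from the complex scattering coefficients using the standard Pauli-basis definition for a reciprocal target. The function names, the toy window values, and the simple averaging over a list of pixels are assumptions made for this example; they are not taken from the paper.

```python
import math

def pauli_vector(s_hh, s_hv, s_vv):
    """Pauli scattering vector of one pixel, assuming reciprocity (S_hv == S_vh)."""
    c = 1.0 / math.sqrt(2.0)
    return [c * (s_hh + s_vv), c * (s_hh - s_vv), c * 2.0 * s_hv]

def coherency_matrix(pixels):
    """Average the outer product k k^H over a window of pixels (multilooking)."""
    t = [[0j] * 3 for _ in range(3)]
    for s_hh, s_hv, s_vv in pixels:
        k = pauli_vector(s_hh, s_hv, s_vv)
        for i in range(3):
            for j in range(3):
                t[i][j] += k[i] * k[j].conjugate()
    n = len(pixels)
    return [[v / n for v in row] for row in t]

# Toy window of two pixels, each given as (S_hh, S_hv, S_vv); values are made up.
window = [(1 + 0j, 0j, 1 + 0j), (0.9 + 0.1j, 0.05j, 1.1 - 0.1j)]
T = coherency_matrix(window)

# The trace of T is the total backscattered power (span) averaged over the window.
span = sum(T[i][i].real for i in range(3))
```

The resulting 3x3 matrix is Hermitian, so it carries nine real degrees of freedom per pixel, which is one common way to turn it into a multi-channel image for a 2D CNN.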

Heterogeneous Network based Contrastive Learning

Figure 2: The overall framework of the proposed HCLNet. It mainly contains two processes: Pretraining and Fine-tuning. In pretraining, it first uses the Feature Filter to combine features, then constructs the heterogeneous network and uses Superpixel-based Instance Discrimination to learn the high-level representation. In fine-tuning, it takes the trained online network from pretraining and fine-tunes it with a small amount of labeled data to better fit the downstream distribution.

Figure 3: The architecture of the heterogeneous network in HCLNet. It contains two networks with different architectures and is updated with the InfoNCE loss. Outputs of the target network that belong to different superpixels in the same minibatch serve as negative samples.

Figure 4: The architecture of the online network in the heterogeneous network. It contains the representation encoder and the projection head; the former will be used for fine-tuning.

The section details the proposed HCLNet (Heterogeneous Network based Contrastive Learning) architecture for PolSAR (Polarimetric Synthetic Aperture Radar) data representation learning. HCLNet comprises three main components:

Feature Filter: A 1D CNN classifier evaluates different combinations of target decomposition features to select an appropriate complementary set of features.
Superpixel-based Instance Discrimination: An improved approach to select positive and negative sample pixels based on superpixel segmentation for the contrastive learning task.
Heterogeneous Network: The core component consisting of two networks with different architectures - a 2D CNN online network for the coherency matrix input and a 1D CNN target network for the filtered target decomposition features. This heterogeneous design learns high-level representations by capturing both spatial and feature-level differences between PolSAR instances.
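
The interplay between the two networks and the superpixel rule can be sketched as an InfoNCE loss in which negatives are restricted to target-network outputs from other superpixels. This is a minimal pure-Python sketch under stated assumptions: the positive for each pixel is taken to be the target network's embedding of the same pixel, same-superpixel embeddings are simply left out of the denominator, and the function names and temperature value are hypothetical. The paper's exact sampling rule may differ.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def superpixel_info_nce(online, target, superpixel_ids, temperature=0.1):
    """Mean InfoNCE loss over a minibatch.

    online[i] and target[i] are the two networks' embeddings of pixel i.
    Assumption: only target embeddings from a different superpixel than
    pixel i are used as negatives for pixel i.
    """
    losses = []
    for i, query in enumerate(online):
        pos = math.exp(cosine(query, target[i]) / temperature)
        neg = sum(
            math.exp(cosine(query, target[j]) / temperature)
            for j in range(len(target))
            if superpixel_ids[j] != superpixel_ids[i]
        )
        losses.append(-math.log(pos / (pos + neg)))
    return sum(losses) / len(losses)

# Toy minibatch: two pixels from two different superpixels.
online_out = [[1.0, 0.0], [0.0, 1.0]]
aligned_target = [[1.0, 0.0], [0.0, 1.0]]
swapped_target = [[0.0, 1.0], [1.0, 0.0]]
ids = [0, 1]

loss_aligned = superpixel_info_nce(online_out, aligned_target, ids)
loss_swapped = superpixel_info_nce(online_out, swapped_target, ids)
```

Here `loss_aligned` is much smaller than `loss_swapped`, reflecting that the loss rewards agreement between the two networks on the same pixel and penalizes similarity to pixels from other superpixels.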

The feature filter performs beam search to iteratively remove feature groups that minimally impact classification accuracy, ensuring complementarity. The superpixel-based instance discrimination redefines positive and negative samples within and across superpixels for contrastive learning. The heterogeneous network learns representations by maximizing agreement between the online and target network outputs for positive sample pairs using an InfoNCE loss. After pre-training, the online network can serve as a backbone for downstream PolSAR classification tasks.
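
A backward beam search of the kind described for the feature filter might look like the sketch below. It is an illustration only: `score` stands in for the accuracy of the 1D CNN evaluator, the feature-group names and the toy scoring rule are hypothetical, and the stopping criterion (stop once every removal lowers accuracy by more than `tolerance`) is an assumption rather than the paper's rule.

```python
def feature_filter(groups, score, beam_width=2, min_groups=1, tolerance=0):
    """Iteratively drop feature groups whose removal costs the least accuracy.

    score(subset) returns the validation accuracy (here in percent) of a
    classifier trained on that subset of feature groups.
    """
    full = frozenset(groups)
    best_score = score(full)
    selected = full
    beam = [full]
    while len(selected) > min_groups:
        # Expand every subset in the beam by removing one group at a time.
        candidates = {}
        for subset in beam:
            for group in subset:
                reduced = subset - {group}
                if reduced not in candidates:
                    candidates[reduced] = score(reduced)
        ranked = sorted(candidates, key=candidates.get, reverse=True)
        beam = ranked[:beam_width]
        top = beam[0]
        if candidates[top] < best_score - tolerance:
            break  # every remaining removal now hurts accuracy
        selected = top
        best_score = max(best_score, candidates[top])
    return selected, score(selected)

# Hypothetical scorer: "freeman" and "yamaguchi" carry redundant information,
# so keeping either one is as good as keeping both.
def toy_score(subset):
    return (50
            + 20 * ("pauli" in subset)
            + 20 * ("h_a_alpha" in subset)
            + 10 * ("freeman" in subset or "yamaguchi" in subset))

groups = ["pauli", "freeman", "yamaguchi", "h_a_alpha"]
selected, accuracy = feature_filter(groups, toy_score)
```

With this toy scorer the search drops one of the two redundant groups and then stops, because removing anything else would reduce the score.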

Experimental Results and Analysis

The paper employs three standard PolSAR datasets (RADARSAT-2 Flevoland, AIRSAR Flevoland, and ESAR Oberpfaffenhofen) to verify the effectiveness of the proposed HCLNet method. Details of these datasets and the ground truth images are provided.

The experimental settings, including implementation details, multi-feature extraction, and compared methods, are described. The proposed HCLNet method is compared against several state-of-the-art semi-supervised and PolSAR classification methods (MI-SSL, CF-CSSL, PCLNet, and SSPRL) on the three datasets.

The results show that HCLNet achieves superior classification accuracy, overall accuracy (OA), average accuracy (AA), and kappa coefficient on all three datasets for both few-shot and full-sample settings. Detailed quantitative results and classification maps are provided to demonstrate HCLNet's improvement over other methods, especially in alleviating scattering confusion.
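
For reference, the three summary metrics can all be derived from a confusion matrix using their standard definitions. The sketch below assumes every class has at least one ground-truth sample; the function name and the example counts are made up for illustration and are not results from the paper.

```python
def classification_metrics(confusion):
    """Return (OA, AA, kappa) for confusion[i][j] = count of true class i predicted as j."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n_classes)) for j in range(n_classes)]

    # Overall accuracy: fraction of all pixels classified correctly.
    oa = sum(confusion[i][i] for i in range(n_classes)) / total

    # Average accuracy: mean of the per-class accuracies (per-class recall).
    aa = sum(confusion[i][i] / row_sums[i] for i in range(n_classes)) / n_classes

    # Kappa: agreement corrected for the agreement expected by chance.
    expected = sum(row_sums[i] * col_sums[i] for i in range(n_classes)) / total ** 2
    kappa = (oa - expected) / (1 - expected)
    return oa, aa, kappa

# Made-up two-class example with 100 labeled pixels.
oa, aa, kappa = classification_metrics([[50, 10], [5, 35]])
```

OA can be dominated by large classes, which is why AA and kappa are usually reported alongside it on land-cover benchmarks with unbalanced class sizes.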

The t-SNE visualization of the learned representations further confirms HCLNet's ability to distinguish different categories well and reduce overlapping/disconnection compared to other methods.

The model complexity analysis shows that HCLNet has relatively low parameter count and floating-point operations (FLOPs) compared to other complex methods.

An ablation study validates the effectiveness of each component (heterogeneous network, superpixel-based instance discrimination, and feature filter) in HCLNet's design.

Overall, the experimental results across multiple datasets and analyses demonstrate the generalization, high classification accuracy, and effectiveness of the proposed HCLNet method for PolSAR image classification.

Conclusion

The paper proposes a self-supervised learning method called HCLNet (Heterogeneous Network based Contrastive Learning) for polarimetric synthetic aperture radar (PolSAR) data classification. It introduces two plugins to enhance the learning process: a feature filter that reduces redundancy among multiple target decomposition features, and a superpixel-based instance discrimination technique that learns better representations by reducing similarity between pixels.

HCLNet leverages unsupervised pre-training to extract high-level representations from the physical and statistical features of PolSAR data. After this pre-training, the network can achieve high performance in few-shot PolSAR classification tasks through fine-tuning.

Experiments on three benchmark datasets demonstrate HCLNet's superiority over mainstream methods for both few-shot and full-sample PolSAR classification. The paper highlights that PolSAR data has more valuable features than optical images, and heterogeneous networks have a natural advantage in utilizing these features.

This work sets a precedent for future research on heterogeneous network learning for PolSAR data. The authors plan to explore positive and negative selection problems for heterogeneous networks in depth to further improve PolSAR classifier performance.

Related Papers

Heterogeneous Network Based Contrastive Learning Method for PolSAR Land Cover Classification

Jianfeng Cai, Yue Ma, Zhixi Feng, Shuyuan Yang

Polarimetric synthetic aperture radar (PolSAR) image interpretation is widely used in various fields. Recently, deep learning has made significant progress in PolSAR image classification. Supervised learning (SL) requires a large amount of high-quality labeled PolSAR data to achieve better performance; however, manually labeled data is insufficient. This causes SL to fall into overfitting and degrades its generalization performance. Furthermore, the scattering confusion problem is also a significant challenge that is attracting more attention. To solve these problems, this article proposes a Heterogeneous Network based Contrastive Learning method (HCLNet). It aims to learn high-level representation from unlabeled PolSAR data for few-shot classification according to multi-features and superpixels. Beyond the conventional CL, HCLNet introduces the heterogeneous architecture for the first time to better utilize heterogeneous PolSAR features. It also develops two easy-to-use plugins to narrow the domain gap between optics and PolSAR, a feature filter and superpixel-based instance discrimination, of which the former is used to enhance the complementarity of multi-features and the latter is used to increase the diversity of negative samples. Experiments demonstrate the superiority of HCLNet on three widely used PolSAR benchmark datasets compared with state-of-the-art methods. Ablation studies also verify the importance of each component. Besides, this work has implications for how to efficiently utilize the multi-features of PolSAR data to learn better high-level representation in CL and how to better construct networks suitable for PolSAR data.

5/7/2024

Deep Learning Based Speckle Filtering for Polarimetric SAR Images. Application to Sentinel-1

Alejandro Mestre-Quereda, Juan M. Lopez-Sanchez

Speckle suppression in synthetic aperture radar (SAR) images is a key processing step which continues to be a research topic. A wide variety of methods, using either spatially-based approaches or transform-based strategies, have been developed and have shown to provide outstanding results. However, recent advances in deep learning techniques and their application to SAR image despeckling have been demonstrated to offer state-of-the-art results. Unfortunately, they have been mostly applied to single-polarimetric images. The extension of a deep learning-based approach for speckle removal to polarimetric SAR (PolSAR) images is complicated because of the complex nature of the measured covariance matrices for every image pixel, the properties of which must be preserved during filtering. In this work, we propose a complete framework to remove speckle in polarimetric SAR images using a convolutional neural network. The methodology includes a reversible transformation of the original complex covariance matrix to obtain a set of real-valued intensity bands which are fed to the neural network. In addition, the proposed method includes a change detection strategy to avoid the neural network to learn erroneous features in areas strongly affected by temporal changes, so that the network only learns the underlying speckle component present in the data. The method is implemented and tested with dual-polarimetric images acquired by Sentinel-1. Experiments show that the proposed approach offers exceptional results in both speckle reduction and resolution preservation. More importantly, it is also shown that the neural network is not generating artifacts or introducing bias in the filtered images, making them suitable for further polarimetric processing and exploitation.

8/30/2024

Hierarchical Attention and Parallel Filter Fusion Network for Multi-Source Data Classification

Han Luo, Feng Gao, Junyu Dong, Lin Qi

Hyperspectral image (HSI) and synthetic aperture radar (SAR) data joint classification is a crucial and yet challenging task in the field of remote sensing image interpretation. However, feature modeling in existing methods is deficient to exploit the abundant global, spectral, and local features simultaneously, leading to sub-optimal classification performance. To solve the problem, we propose a hierarchical attention and parallel filter fusion network for multi-source data classification. Concretely, we design a hierarchical attention module for hyperspectral feature extraction. This module integrates global, spectral, and local features simultaneously to provide more comprehensive feature representation. In addition, we develop parallel filter fusion module which enhances cross-modal feature interactions among different spatial locations in the frequency domain. Extensive experiments on two multi-source remote sensing data classification datasets verify the superiority of our proposed method over current state-of-the-art classification approaches. Specifically, our proposed method achieves 91.44% and 80.51% of overall accuracy (OA) on the respective datasets, highlighting its superior performance.

8/26/2024

Learning transformer-based heterogeneously salient graph representation for multimodal remote sensing image classification

Jiaqi Yang, Bo Du, Liangpei Zhang

Data collected by different modalities can provide a wealth of complementary information, such as hyperspectral image (HSI) to offer rich spectral-spatial properties, synthetic aperture radar (SAR) to provide structural information about the Earth's surface, and light detection and ranging (LiDAR) to cover altitude information about ground elevation. Therefore, a natural idea is to combine multimodal images for refined and accurate land-cover interpretation. Although many efforts have been attempted to achieve multi-source remote sensing image classification, there are still three issues as follows: 1) indiscriminate feature representation without sufficiently considering modal heterogeneity, 2) abundant features and complex computations associated with modeling long-range dependencies, and 3) overfitting phenomenon caused by sparsely labeled samples. To overcome the above barriers, a transformer-based heterogeneously salient graph representation (THSGR) approach is proposed in this paper. First, a multimodal heterogeneous graph encoder is presented to encode distinctively non-Euclidean structural features from heterogeneous data. Then, a self-attention-free multi-convolutional modulator is designed for effective and efficient long-term dependency modeling. Finally, a mean forward is put forward in order to avoid overfitting. Based on the above structures, the proposed model is able to break through modal gaps to obtain differentiated graph representation with competitive time cost, even for a small fraction of training samples. Experiments and analyses on three benchmark datasets with various state-of-the-art (SOTA) methods show the performance of the proposed approach.

6/11/2024