Human-annotated label noise and their impact on ConvNets for remote sensing image scene classification

2305.12106

Published 5/1/2024 by Longkang Peng, Tao Wei, Xuehong Chen, Xiaobei Chen, Rui Sun, Luoma Wan, Jin Chen, Xiaolin Zhu


Abstract

Convolutional neural networks (ConvNets) have been successfully applied to satellite image scene classification. Human-labeled training datasets are essential for ConvNets to perform accurate classification. Errors in human-annotated training datasets are unavoidable due to the complexity of satellite images. However, the distribution of real-world human-annotated label noise on remote sensing images and its impact on ConvNets have not been investigated. To fill this research gap, this study, for the first time, collected real-world labels from 32 participants and explored how their annotated label noise affects three representative ConvNets (VGG16, GoogleNet, and ResNet-50) for remote sensing image scene classification. We found that: (1) human-annotated label noise exhibits significant class and instance dependence; (2) an additional 1% of human-annotated label noise in training data leads to a 0.5% reduction in the overall accuracy of ConvNets classification; (3) the error pattern of ConvNet predictions was strongly correlated with that of participants' labels. To uncover the mechanism underlying the impact of human labeling errors on ConvNets, we further compared it with three types of simulated label noise: uniform noise, class-dependent noise and instance-dependent noise. Our results show that the impact of human-annotated label noise on ConvNets significantly differs from all three types of simulated label noise, while both class dependence and instance dependence contribute to the impact of human-annotated label noise on ConvNets. These observations necessitate a reevaluation of the handling of noisy labels, and we anticipate that our real-world label noise dataset will facilitate the future development and assessment of label-noise learning algorithms.


Overview

Convolutional neural networks (ConvNets) are commonly used for satellite image scene classification tasks
Accurate classification relies on high-quality, human-annotated training datasets
Errors in human-annotated labels are unavoidable due to the complexity of satellite images
The impact of real-world human-annotated label noise on ConvNet performance is not well understood

Plain English Explanation

Convolutional neural networks (ConvNets) are a type of machine learning model that has been successfully used to classify scenes in satellite images. These models need to be trained on datasets in which humans have labeled the images, so that the model learns what different types of scenes look like. However, even with careful human labeling, mistakes in the labels are unavoidable because satellite images can be very complex and difficult to interpret.

This study looked at the impact of these real-world human-annotated label mistakes on the performance of three popular ConvNet models (VGG16, GoogleNet, and ResNet-50) for classifying remote sensing images. The researchers collected labels from 32 human participants and found that the label noise exhibited significant class and instance dependence. This means the mistakes were not random, but were related to the specific class or image being labeled.

They further found that each additional 1% of human-annotated label noise in the training data reduced the overall accuracy of the ConvNet models by about 0.5%. In addition, the errors made by the ConvNet models were strongly correlated with the errors made by the human participants.
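As a rough worked example of this relationship, the sketch below treats the reported figure as a linear rule of thumb. The function name, the baseline accuracy of 95, and the reading of both percentages as percentage points are assumptions made for illustration; the summary does not say over what range of noise the relationship holds.

```python
def expected_accuracy(clean_accuracy, noise_rate, slope=0.5):
    """Linear rule of thumb for accuracy under label noise.

    clean_accuracy: overall accuracy (in %) with noise-free labels (hypothetical).
    noise_rate:     share of mislabeled training images (in %).
    slope:          accuracy points lost per 1 point of label noise (0.5 as reported).
    """
    return clean_accuracy - slope * noise_rate


# A hypothetical model at 95% accuracy trained with 10% label noise.
print(expected_accuracy(95.0, 10.0))  # 90.0
```

Under these assumptions, a training set with 10% mislabeled images would cost about 5 points of overall accuracy.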

To understand the mechanism behind this, the researchers compared the impact of the real-world human label noise to different types of simulated label noise. They found that the impact of the real-world noise was significantly different from all the simulated noise types, and that both class dependence and instance dependence contributed to the effect on the ConvNets.

These findings highlight the need to reevaluate how we handle noisy labels in training machine learning models for real-world applications like satellite image analysis. The researchers believe their dataset of real-world human-annotated label noise will help advance the development of better techniques for dealing with this challenge.

Technical Explanation

This study investigated the distribution and impact of real-world human-annotated label noise on the performance of three representative ConvNet models (VGG16, GoogleNet, and ResNet-50) for remote sensing image scene classification. The researchers collected labels from 32 human participants and analyzed the characteristics of the resulting label noise.

They found that the human-annotated label noise exhibited significant class and instance dependence, meaning the mistakes were not random but correlated with the specific class or image being labeled. Further analysis showed that an additional 1% of human-annotated label noise in the training data led to a 0.5% reduction in the overall accuracy of the ConvNet models.
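Class and instance dependence can be made concrete with two simple statistics. The sketch below is one plausible way to compute them with NumPy, not the paper's own analysis code: a row-normalized noise transition matrix (how often each true class is labeled as each other class) and a per-image error rate across annotators. All function names and the toy labels are hypothetical.

```python
import numpy as np


def noise_transition_matrix(true_labels, noisy_labels, num_classes):
    """Entry [i, j] is the fraction of images of true class i that were labeled j."""
    true_labels = np.asarray(true_labels)
    noisy_labels = np.asarray(noisy_labels)
    counts = np.zeros((num_classes, num_classes), dtype=float)
    np.add.at(counts, (true_labels, noisy_labels), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Classes with no images keep NaN rows instead of triggering a division by zero.
    return np.divide(counts, row_sums,
                     out=np.full_like(counts, np.nan), where=row_sums > 0)


def per_class_noise_rate(transition):
    """Share of mislabeled images per true class (class dependence)."""
    return 1.0 - np.diag(transition)


def per_instance_error_rate(true_labels, annotations):
    """Share of annotators who mislabeled each image (instance dependence).

    annotations has shape (num_annotators, num_images).
    """
    annotations = np.asarray(annotations)
    return (annotations != np.asarray(true_labels)[None, :]).mean(axis=0)


true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
noisy = [0, 0, 0, 1, 1, 1, 1, 1, 0, 1]
transition = noise_transition_matrix(true, noisy, num_classes=3)
print(per_class_noise_rate(transition))
```

Noise rates that vary widely across classes point to class dependence; error rates that vary widely across images of the same class point to instance dependence.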

Interestingly, the error pattern of the ConvNet predictions was strongly correlated with the error pattern in the participant labels, suggesting the models were heavily influenced by the inherent noise in the human-annotated training data.
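One way to quantify such a correlation is to compare the off-diagonal entries of two confusion matrices: one for the participants' labels and one for the model's predictions, both measured against the reference labels. The sketch below uses a Pearson correlation. This is an assumed choice of statistic, since the summary does not state which measure the authors used, and the function names and toy data are hypothetical.

```python
import numpy as np


def confusion_counts(true_labels, assigned_labels, num_classes):
    """Count how often each true class (row) received each assigned label (column)."""
    counts = np.zeros((num_classes, num_classes), dtype=float)
    np.add.at(counts, (np.asarray(true_labels), np.asarray(assigned_labels)), 1.0)
    return counts


def error_pattern_correlation(true_labels, human_labels, model_predictions, num_classes):
    """Pearson correlation between human and model confusions (errors only)."""
    off_diagonal = ~np.eye(num_classes, dtype=bool)  # drop the correct-label entries
    human_errors = confusion_counts(true_labels, human_labels, num_classes)[off_diagonal]
    model_errors = confusion_counts(true_labels, model_predictions, num_classes)[off_diagonal]
    return float(np.corrcoef(human_errors, model_errors)[0, 1])


true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
human = [0, 0, 1, 1, 1, 1, 2, 0, 0]
model = [0, 1, 1, 1, 1, 1, 2, 2, 0]
print(error_pattern_correlation(true, human, model, num_classes=3))
```

A value near 1 would indicate that the model confuses the same class pairs as the annotators. The correlation is undefined when either error vector is constant, for example when a labeler makes no errors at all.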

To better understand this phenomenon, the researchers compared the impact of the real-world human label noise to three types of simulated label noise: uniform noise, class-dependent noise, and instance-dependent noise. They found that the impact of the real-world human-annotated label noise on the ConvNets differed significantly from all three simulated noise types. Both class dependence and instance dependence were found to contribute to the impact of the real-world label noise on the model performance.
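The three simulated noise types can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulation procedure: uniform noise flips each label with a fixed probability to any other class, class-dependent noise resamples labels from a transition matrix, and instance-dependent noise uses a per-image flip probability and a per-image alternative label. The function names and the `flip_probs` and `alt_labels` inputs are hypothetical; in practice they might be estimated from annotator disagreement.

```python
import numpy as np


def add_uniform_noise(labels, noise_rate, num_classes, rng):
    """Flip each label with probability noise_rate to a different, uniformly chosen class."""
    labels = np.asarray(labels)
    noisy = labels.copy()
    flip = rng.random(labels.shape[0]) < noise_rate
    # An offset in [1, num_classes - 1] modulo num_classes never returns the original class.
    offsets = rng.integers(1, num_classes, size=int(flip.sum()))
    noisy[flip] = (labels[flip] + offsets) % num_classes
    return noisy


def add_class_dependent_noise(labels, transition, rng):
    """Resample each label from the transition-matrix row of its true class."""
    transition = np.asarray(transition)
    num_classes = transition.shape[0]
    return np.array([rng.choice(num_classes, p=transition[y]) for y in np.asarray(labels)])


def add_instance_dependent_noise(labels, flip_probs, alt_labels, rng):
    """Flip image i with its own probability flip_probs[i] to its own label alt_labels[i]."""
    labels = np.asarray(labels)
    flip = rng.random(labels.shape[0]) < np.asarray(flip_probs)
    return np.where(flip, np.asarray(alt_labels), labels)


rng = np.random.default_rng(0)
clean = rng.integers(0, 3, size=1000)
noisy = add_uniform_noise(clean, noise_rate=0.2, num_classes=3, rng=rng)
print((noisy != clean).mean())  # empirical noise rate, close to 0.2
```

Matching the overall noise rate across the three generators is what makes a comparison with real human labels meaningful, since any remaining difference in model accuracy then comes from how the errors are distributed rather than how many there are.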

Critical Analysis

The researchers acknowledge that the complexity of satellite images and the inherent challenges in human labeling make errors in training datasets unavoidable. However, they also note that the distribution and impact of these real-world noisy labels have not been well studied, which is an important gap in the literature.

This study provides valuable insights into the characteristics of real-world human-annotated label noise and its significant effects on ConvNet performance. The finding that the impact of the real-world noise differs from simulated noise types highlights the importance of using realistic datasets to assess and develop techniques for handling noisy labels.

One potential limitation of the study is the relatively small number of human participants (32) involved in the data collection. While the researchers were able to uncover important patterns, a larger and more diverse set of human annotators could provide even deeper insights into the distribution and effects of real-world label noise.

Additionally, the study focused on three specific ConvNet architectures, and it would be interesting to see if the findings hold true for a wider range of model types and applications. Exploring the impact of real-world label noise on other machine learning algorithms, such as contrastive-based deep embeddings or auto-encoder based image analysis, could provide a more comprehensive understanding of the broader implications.

Conclusion

This study provides valuable insights into the distribution and impact of real-world human-annotated label noise on the performance of ConvNet models for remote sensing image classification. The researchers found that the label noise exhibits significant class and instance dependence, and that each additional 1% of label noise in the training data costs about 0.5% in overall accuracy.

Importantly, the impact of the real-world label noise was found to differ significantly from simulated noise types, highlighting the need to use realistic datasets when assessing and developing techniques for handling noisy labels. The researchers believe their dataset of real-world human-annotated label noise will be a valuable resource for advancing research in this area, which is critical for improving the reliability and robustness of machine learning models in real-world applications like satellite image analysis.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Noisy Label Processing for Classification: A Survey

Mengting Li, Chuang Zhu

In recent years, deep neural networks (DNNs) have gained remarkable achievement in computer vision tasks, and the success of DNNs often depends greatly on the richness of data. However, the acquisition process of data and high-quality ground truth requires a lot of manpower and money. In the long, tedious process of data annotation, annotators are prone to make mistakes, resulting in incorrect labels of images, i.e., noisy labels. The emergence of noisy labels is inevitable. Moreover, since research shows that DNNs can easily fit noisy labels, the existence of noisy labels will cause significant damage to the model training process. Therefore, it is crucial to combat noisy labels for computer vision tasks, especially for classification tasks. In this survey, we first comprehensively review the evolution of different deep learning approaches for noisy label combating in the image classification task. In addition, we also review different noise patterns that have been proposed to design robust algorithms. Furthermore, we explore the inner pattern of real-world label noise and propose an algorithm to generate a synthetic label noise pattern guided by real-world data. We test the algorithm on the well-known real-world dataset CIFAR-10N to form a new real-world data-guided synthetic benchmark and evaluate some typical noise-robust methods on the benchmark.

4/8/2024

cs.CV cs.AI

Benchmarking Label Noise in Instance Segmentation: Spatial Noise Matters

Eden Grad, Moshe Kimhi, Lion Halika, Chaim Baskin

Obtaining accurate labels for instance segmentation is particularly challenging due to the complex nature of the task. Each image necessitates multiple annotations, encompassing not only the object's class but also its precise spatial boundaries. These requirements elevate the likelihood of errors and inconsistencies in both manual and automated annotation processes. By simulating different noise conditions, we provide a realistic scenario for assessing the robustness and generalization capabilities of instance segmentation models in different segmentation tasks, introducing COCO-N and Cityscapes-N. We also propose a benchmark for weakly annotation noise, dubbed COCO-WAN, which utilizes foundation models and weak annotations to simulate semi-automated annotation tools and their noisy labels. This study sheds light on the quality of segmentation masks produced by various models and challenges the efficacy of popular methods designed to address learning with label noise.

6/19/2024

cs.CV cs.LG

Task Specific Pretraining with Noisy Labels for Remote sensing Image Segmentation

Chenying Liu, Conrad M Albrecht, Yi Wang, Xiao Xiang Zhu

Compared to supervised deep learning, self-supervision provides remote sensing a tool to reduce the amount of exact, human-crafted geospatial annotations. While image-level information for unsupervised pretraining efficiently works for various classification downstream tasks, the performance on pixel-level semantic segmentation lags behind in terms of model accuracy. On the contrary, many easily available label sources (e.g., automatic labeling tools and land cover land use products) exist, which can provide a large amount of noisy labels for segmentation model training. In this work, we propose to exploit noisy semantic segmentation maps for model pretraining. Our experiments provide insights on robustness per network layer. The transfer learning settings test the cases when the pretrained encoders are fine-tuned for different label classes and decoders. The results from two datasets indicate the effectiveness of task-specific supervised pretraining with noisy labels. Our findings pave new avenues to improved model accuracy and novel pretraining strategies for efficient remote sensing image segmentation.

6/11/2024

cs.CV


NoiseBench: Benchmarking the Impact of Real Label Noise on Named Entity Recognition

Elena Merdjanovska, Ansar Aynetdinov, Alan Akbik

Available training data for named entity recognition (NER) often contains a significant percentage of incorrect labels for entity types and entity boundaries. Such label noise poses challenges for supervised learning and may significantly deteriorate model quality. To address this, prior work proposed various noise-robust learning approaches capable of learning from data with partially incorrect labels. These approaches are typically evaluated using simulated noise where the labels in a clean dataset are automatically corrupted. However, as we show in this paper, this leads to unrealistic noise that is far easier to handle than real noise caused by human error or semi-automatic annotation. To enable the study of the impact of various types of real noise, we introduce NoiseBench, an NER benchmark consisting of clean training data corrupted with 6 types of real noise, including expert errors, crowdsourcing errors, automatic annotation errors and LLM errors. We present an analysis that shows that real noise is significantly more challenging than simulated noise, and show that current state-of-the-art models for noise-robust learning fall far short of their theoretically achievable upper bound. We release NoiseBench to the research community.

5/14/2024

cs.CL cs.AI cs.LG