Large-scale Pre-trained Models are Surprisingly Strong in Incremental Novel Class Discovery

Read original: arXiv:2303.15975 - Published 8/26/2024 by Mingxuan Liu, Subhankar Roy, Zhun Zhong, Nicu Sebe, Elisa Ricci

Overview

The paper explores the challenge of "discovering novel concepts in unlabelled datasets" in a continuous manner, which is an important goal for lifelong learning systems.
Previous work has only partially addressed this problem under very restrictive settings, such as by accessing a related labelled dataset or by relying on a model pre-trained with supervision.
This paper proposes a novel learning paradigm that can perform continuous and truly unsupervised class discovery, without relying on any related labelled dataset.
The key idea is to exploit the richer priors from strong self-supervised pre-trained models (PTMs).
The authors propose simple baseline models that use a frozen PTM backbone and a learnable linear classifier, which are easy to implement and resilient to longer learning scenarios.
Extensive empirical evaluation on multiple benchmarks shows the effectiveness of the proposed baselines compared to sophisticated state-of-the-art methods.

Plain English Explanation

The paper tackles the challenge of continual learning: the ability of AI systems to keep learning new concepts and skills over time without forgetting what they have already learned. Specifically, the researchers focus on discovering novel classes in unlabeled datasets in an ongoing, unsupervised manner.

Previous approaches to this problem have been limited, either requiring access to a related labeled dataset or relying on a model that was pre-trained with supervision. The authors propose a new approach that can discover novel classes continuously and without any labeled data.

The key insight is to leverage the power of strong self-supervised pre-trained models, which have learned rich representations from large amounts of unlabeled data. The authors design simple baseline models that use these pre-trained models as a fixed "backbone" and learn a new classifier on top in an unsupervised way.

Through extensive testing on various benchmarks, the authors demonstrate that their proposed baselines outperform more complex state-of-the-art methods for continual novel class discovery. The simplicity and robustness of their approach make it a promising direction for building lifelong learning systems that can adapt to new concepts without forgetting the old.

Technical Explanation

The paper proposes a novel learning paradigm for continual novel class discovery (CNCD): the ability to discover new classes in unlabeled data streams in an unsupervised and ongoing manner. Previous work has only partially addressed this problem under very restrictive settings, such as by accessing a related labeled dataset (e.g., NCD) or by leveraging a model pre-trained with supervision (e.g., class-iNCD).

The key contribution of this paper is to challenge the status quo in class-iNCD and propose a truly unsupervised CNCD framework that can operate continuously without any related labeled data. The core idea is to exploit the richer priors from strong self-supervised pre-trained models (PTMs). Specifically, the authors propose simple baseline models composed of a frozen PTM backbone and a learnable linear classifier on top.
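
The frozen-backbone-plus-linear-classifier design described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: `BACKBONE_W` is a tiny fixed projection standing in for a real self-supervised PTM, and the pseudo-labels come from plain k-means on the frozen features, which is a swapped-in technique rather than the training objective used in the paper. The names `frozen_backbone`, `kmeans_pseudo_labels`, `LinearClassifier`, and `discover_classes` are hypothetical.

```python
import math

# Hypothetical stand-in for a self-supervised PTM: a fixed projection, never updated.
BACKBONE_W = ((1.0, 0.5), (-0.5, 1.0), (0.25, -0.75))

def frozen_backbone(x):
    """Map a raw input to an L2-normalised feature vector; holds no trainable state."""
    feats = [sum(w * v for w, v in zip(row, x)) for row in BACKBONE_W]
    norm = math.sqrt(sum(f * f for f in feats)) or 1.0
    return [f / norm for f in feats]

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def kmeans_pseudo_labels(features, k, iters=20):
    """Assumed pseudo-labelling step: k-means with farthest-point initialisation."""
    centroids = [features[0]]
    while len(centroids) < k:
        centroids.append(max(features, key=lambda f: min(dist2(f, c) for c in centroids)))
    labels = [0] * len(features)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(f, centroids[j])) for f in features]
        for j in range(k):
            members = [f for f, lab in zip(features, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

class LinearClassifier:
    """The only learnable part: one weight row and one bias per discovered class."""
    def __init__(self, dim, num_classes):
        self.w = [[0.0] * dim for _ in range(num_classes)]
        self.b = [0.0] * num_classes

    def probs(self, f):
        logits = [sum(wi * fi for wi, fi in zip(row, f)) + b for row, b in zip(self.w, self.b)]
        top = max(logits)
        exps = [math.exp(z - top) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def predict(self, f):
        p = self.probs(f)
        return p.index(max(p))

    def sgd_step(self, f, target, lr):
        """One softmax cross-entropy gradient step on a single feature vector."""
        p = self.probs(f)
        for c in range(len(self.w)):
            grad = p[c] - (1.0 if c == target else 0.0)
            self.b[c] -= lr * grad
            for i in range(len(f)):
                self.w[c][i] -= lr * grad * f[i]

def discover_classes(unlabeled, num_novel, epochs=200, lr=0.5):
    """Fit a linear head on pseudo-labels; the backbone is used for inference only."""
    features = [frozen_backbone(x) for x in unlabeled]
    pseudo = kmeans_pseudo_labels(features, num_novel)
    clf = LinearClassifier(len(features[0]), num_novel)
    for _ in range(epochs):
        for f, y in zip(features, pseudo):
            clf.sgd_step(f, y, lr)
    return clf

# Toy unlabeled data with two visibly separate groups.
data = [(2.0, 0.1), (1.9, -0.1), (2.1, 0.0), (0.1, 2.0), (-0.1, 1.9), (0.0, 2.1)]
classifier = discover_classes(data, num_novel=2)
predictions = [classifier.predict(frozen_backbone(x)) for x in data]
```

Because only the `LinearClassifier` weights are ever updated, the features for a given input stay identical throughout training, which is the property the frozen-backbone design relies on.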

Through extensive experiments on multiple benchmarks, the authors demonstrate the effectiveness of their proposed baselines compared to sophisticated state-of-the-art methods. The simplicity and resilience of their approach under longer learning scenarios make it a promising direction for building continual learning systems.
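
One way to see why a frozen backbone might help over longer learning scenarios is sketched below. It assumes, purely for illustration, that each discovery step contributes its own linear head and that the heads are concatenated into one joint classifier; the summary does not confirm that this is how the authors combine steps. `frozen_features`, `Head`, and `IncrementalClassifier` are hypothetical names, and the weights are hand-picked rather than learned.

```python
def frozen_features(x):
    """Hypothetical frozen feature extractor: a fixed map with no trainable state."""
    return [x[0] + x[1], x[0] - x[1]]

class Head:
    """One linear head (weights and biases) produced by a single discovery step."""
    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def logits(self, f):
        return [sum(w * v for w, v in zip(row, f)) + b
                for row, b in zip(self.weights, self.bias)]

class IncrementalClassifier:
    """Joint classifier formed by concatenating the logits of all heads so far."""
    def __init__(self):
        self.heads = []

    def add_head(self, head):
        # Earlier heads are left untouched when a new step arrives.
        self.heads.append(head)

    def logits(self, x):
        f = frozen_features(x)
        out = []
        for head in self.heads:
            out.extend(head.logits(f))
        return out

    def predict(self, x):
        z = self.logits(x)
        return z.index(max(z))

model = IncrementalClassifier()
model.add_head(Head([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))    # step 1: classes 0 and 1
old_logits = model.logits((2.0, 1.0))
model.add_head(Head([[-1.0, 0.0], [0.0, -1.0]], [0.0, 0.0]))  # step 2: classes 2 and 3
new_logits = model.logits((2.0, 1.0))
```

In this sketch `new_logits[:2]` equals `old_logits`: since neither the features nor the earlier head change, the scores for previously discovered classes are preserved exactly. Heads trained independently may still produce logits on different scales, so a real system would likely need some form of normalisation or calibration across steps.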

Critical Analysis

The paper presents a compelling approach to the important problem of continual novel class discovery. By building on self-supervised pre-trained models, the authors achieve strong results with simple baseline architectures.

One potential limitation of the work is that it relies on the availability of a suitable pre-trained model for the given domain. The authors do not explore the sensitivity of their approach to the choice of PTM or the impact of fine-tuning the PTM during learning. It would be interesting to see how the performance varies under different PTM settings.

Additionally, the paper does not provide much insight into the specific failure modes or limitations of their approach. While the results are strong, it would be valuable to understand the types of scenarios where the baselines might struggle and where further research is needed.

Despite these minor caveats, the core contribution of the paper – demonstrating the power of PTMs for continual novel class discovery – is a significant advancement in the field of lifelong learning. The authors' work encourages further exploration of this direction and highlights the potential for simple yet effective models to tackle complex learning problems.

Conclusion

This paper presents an innovative approach to the challenge of continual novel class discovery, which is a crucial capability for lifelong learning systems. By leveraging the rich representations learned by self-supervised pre-trained models, the authors develop simple baseline models that can perform unsupervised class discovery in an ongoing manner, without relying on any related labeled data.

The empirical results demonstrate the effectiveness of the proposed baselines, which outperform sophisticated state-of-the-art methods. The simplicity and resilience of the approach make it a promising direction for building continual learning systems that can adapt to new concepts over time without forgetting the old.

Overall, this work represents an important step forward in the field of lifelong learning, showcasing the potential of pre-trained models to serve as powerful priors for discovering novel concepts in an unsupervised and continuous fashion. The insights and techniques presented in this paper are likely to inspire further advancements in this area and contribute to the development of more adaptable and capable AI systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Large-scale Pre-trained Models are Surprisingly Strong in Incremental Novel Class Discovery

Mingxuan Liu, Subhankar Roy, Zhun Zhong, Nicu Sebe, Elisa Ricci

Discovering novel concepts in unlabelled datasets and in a continuous manner is an important desideratum of lifelong learners. In the literature such problems have been partially addressed under very restricted settings, where novel classes are learned by jointly accessing a related labelled set (e.g., NCD) or by leveraging only a supervisedly pre-trained model (e.g., class-iNCD). In this work we challenge the status quo in class-iNCD and propose a learning paradigm where class discovery occurs continuously and truly unsupervisedly, without needing any related labelled set. In detail, we propose to exploit the richer priors from strong self-supervised pre-trained models (PTM). To this end, we propose simple baselines, composed of a frozen PTM backbone and a learnable linear classifier, that are not only simple to implement but also resilient under longer learning scenarios. We conduct extensive empirical evaluation on a multitude of benchmarks and show the effectiveness of our proposed baselines when compared with sophisticated state-of-the-art methods. The code is open source.

8/26/2024

Continual Learning with Pre-Trained Models: A Survey

Da-Wei Zhou, Hai-Long Sun, Jingyi Ning, Han-Jia Ye, De-Chuan Zhan

Nowadays, real-world applications often face streaming data, which requires the learning system to absorb new knowledge as data evolves. Continual Learning (CL) aims to achieve this goal and meanwhile overcome the catastrophic forgetting of former knowledge when learning new ones. Typical CL methods build the model from scratch to grow with incoming data. However, the advent of the pre-trained model (PTM) era has sparked immense research interest, particularly in leveraging PTMs' robust representational capabilities. This paper presents a comprehensive survey of the latest advancements in PTM-based CL. We categorize existing methodologies into three distinct groups, providing a comparative analysis of their similarities, differences, and respective advantages and disadvantages. Additionally, we offer an empirical study contrasting various state-of-the-art methods to highlight concerns regarding fairness in comparisons. The source code to reproduce these evaluations is available at: https://github.com/sun-hailong/LAMDA-PILOT

4/24/2024

Exploiting Fine-Grained Prototype Distribution for Boosting Unsupervised Class Incremental Learning

Jiaming Liu, Hongyuan Liu, Zhili Qin, Wei Han, Yulu Fan, Qinli Yang, Junming Shao

The dynamic nature of open-world scenarios has attracted more attention to class incremental learning (CIL). However, existing CIL methods typically presume the availability of complete ground-truth labels throughout the training process, an assumption rarely met in practical applications. Consequently, this paper explores a more challenging problem of unsupervised class incremental learning (UCIL). The essence of addressing this problem lies in effectively capturing comprehensive feature representations and discovering unknown novel classes. To achieve this, we first model the knowledge of class distribution by exploiting fine-grained prototypes. Subsequently, a granularity alignment technique is introduced to enhance the unsupervised class discovery. Additionally, we propose a strategy to minimize overlap between novel and existing classes, thereby preserving historical knowledge and mitigating the phenomenon of catastrophic forgetting. Extensive experiments on five datasets demonstrate that our approach significantly outperforms current state-of-the-art methods, indicating the effectiveness of the proposed method.

8/20/2024

Do Pre-trained Models Benefit Equally in Continual Learning?

Kuan-Ying Lee, Yuanyi Zhong, Yu-Xiong Wang

Existing work on continual learning (CL) is primarily devoted to developing algorithms for models trained from scratch. Despite their encouraging performance on contrived benchmarks, these algorithms show dramatic performance drops in real-world scenarios. Therefore, this paper advocates the systematic introduction of pre-training to CL, which is a general recipe for transferring knowledge to downstream tasks but is substantially missing in the CL community. Our investigation reveals the multifaceted complexity of exploiting pre-trained models for CL, along three different axes: pre-trained models, CL algorithms, and CL scenarios. Perhaps most intriguingly, improvements in CL algorithms from pre-training are very inconsistent: an underperforming algorithm could become competitive and even state-of-the-art when all algorithms start from a pre-trained model. This indicates that the current paradigm, where all CL methods are compared in from-scratch training, is not well reflective of the true CL objective and desired progress. In addition, we make several other important observations, including that CL algorithms that exert less regularization benefit more from a pre-trained model, and that a stronger pre-trained model such as CLIP does not guarantee a better improvement. Based on these findings, we introduce a simple yet effective baseline that employs minimum regularization and leverages the more beneficial pre-trained model, coupled with a two-stage training pipeline. We recommend including this strong baseline in the future development of CL algorithms, due to its demonstrated state-of-the-art performance.

7/8/2024