Non-negative Contrastive Learning

Read original: arXiv:2403.12459 - Published 4/24/2024 by Yifei Wang, Qi Zhang, Yaoyu Guo, Yisen Wang

Overview

This paper introduces a new approach called "Non-negative Contrastive Learning" (NCL) that aims to address limitations in existing contrastive learning techniques.
Contrastive learning is a popular unsupervised representation learning method that learns useful representations by contrasting positive and negative examples.
The paper highlights issues with representation symmetry in contrastive learning and proposes NCL as a solution to learn asymmetric and non-negative representations.

Plain English Explanation

Contrastive learning is a technique used in machine learning to learn useful representations of data in an unsupervised way. It works by comparing "positive" examples (similar things) to "negative" examples (dissimilar things) and trying to learn representations that can distinguish between them.
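To make the positive-versus-negative comparison concrete, here is a minimal sketch of an InfoNCE-style contrastive loss for a single anchor. It is an illustration only, not the paper's training code: the function names, the temperature value, and the toy vectors are all assumptions chosen for this example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    # Scale to unit length so the dot product becomes cosine similarity.
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v] if norm > 0 else list(v)

def info_nce(anchor, positive, negatives, temperature=0.5):
    """Negative log-probability of picking the positive among all candidates."""
    a = normalize(anchor)
    candidates = [positive] + list(negatives)
    logits = [dot(a, normalize(c)) / temperature for c in candidates]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]

# Hypothetical 2-d features: one anchor, two candidate positives, two negatives.
anchor = [1.0, 0.2]
negatives = [[-0.8, 0.5], [0.1, -1.0]]
loss_good = info_nce(anchor, [0.9, 0.3], negatives)   # positive close to anchor
loss_bad = info_nce(anchor, [-1.0, 0.1], negatives)   # positive far from anchor
```

The loss is small when the positive lies closer to the anchor than the negatives do, and large otherwise, which is the signal the encoder is trained on.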

However, the paper argues that existing contrastive learning approaches have issues with "representation symmetry" - the representations they learn can be positive or negative, which may not always be desirable. The authors propose a new approach called "Non-negative Contrastive Learning" (NCL) that learns representations whose entries are always non-negative (i.e. zero or positive).

The key idea behind NCL is to constrain the representations to be non-negative, which can provide some advantages over standard contrastive learning. For example, non-negative matrix factorization (NMF) has been shown to learn more interpretable features, and non-negative subspace feature representation can be useful for few-shot learning tasks.
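For readers unfamiliar with NMF, the following sketch factorizes a small non-negative matrix V into non-negative factors W and H using the classic multiplicative update rules. This is standard NMF, not NCL itself, and the toy matrix, rank, step count, and seed are assumptions made for illustration.

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frob_error(V, W, H):
    # Squared Frobenius norm of the reconstruction residual V - WH.
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh))

def nmf(V, rank, steps=200, eps=1e-9, seed=0):
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive initialization; multiplicative updates keep entries >= 0.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    initial_error = frob_error(V, W, H)
    for _ in range(steps):
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H, initial_error

# Hypothetical data: rows 0-1 and rows 2-3 form two separate "clusters".
V = [[5.0, 4.0, 0.0, 0.0],
     [4.0, 5.0, 0.0, 0.0],
     [0.0, 0.0, 3.0, 4.0],
     [0.0, 0.0, 4.0, 3.0]]
W, H, initial_error = nmf(V, rank=2)
final_error = frob_error(V, W, H)
```

Because every entry of W and H stays non-negative, parts can only add up and never cancel each other, which is the usual intuition for why NMF factors tend to be easier to read.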

The paper demonstrates how NCL can be applied and shows that it outperforms standard contrastive learning on feature disentanglement, feature selection, and downstream classification tasks. The authors also provide theoretical analysis, including guarantees on identifiability and downstream generalization, to explain why NCL can be advantageous.

Technical Explanation

The paper proposes a new contrastive learning framework called "Non-negative Contrastive Learning" (NCL) that learns non-negative and asymmetric representations.

The key limitation of standard contrastive learning approaches that the paper identifies is "representation symmetry" - the representations learned can be positive or negative, which may not always be desirable. For example, clinical-oriented multi-level contrastive learning has shown the benefits of non-negative representations in healthcare applications.
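One way to read the symmetry issue is sketched below: flipping the sign of one feature dimension for every sample leaves all pairwise similarities, and therefore a similarity-based contrastive loss, unchanged, so the loss alone cannot prefer one sign over the other. A non-negativity constraint rules out the flipped solution. The loss form, helper names, and toy vectors are assumptions for illustration, not the paper's exact setup.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v] if norm > 0 else list(v)

def info_nce(anchor, positive, negatives, temperature=0.5):
    # InfoNCE-style loss on cosine similarities for a single anchor.
    a = normalize(anchor)
    logits = [dot(a, normalize(x)) / temperature for x in [positive] + list(negatives)]
    m = max(logits)
    return m + math.log(sum(math.exp(l - m) for l in logits)) - logits[0]

def flip_first_dim(v):
    # Negate feature dimension 0; applied to every sample this is an orthogonal map.
    return [-v[0]] + list(v[1:])

def is_non_negative(vectors):
    return all(a >= 0 for v in vectors for a in v)

# Hypothetical non-negative features: anchor, positive, then two negatives.
original = [[0.9, 0.1, 0.4], [0.8, 0.0, 0.5], [0.0, 1.0, 0.2], [0.1, 0.7, 0.0]]
flipped = [flip_first_dim(v) for v in original]

loss = info_nce(original[0], original[1], original[2:])
loss_flipped = info_nce(flipped[0], flipped[1], flipped[2:])
```

Here `loss` and `loss_flipped` agree, yet only `original` satisfies the non-negativity constraint, so the constraint removes this particular ambiguity.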

To address this, the authors introduce NCL, which constrains the learned representations to be non-negative. This is achieved by enforcing a non-negativity constraint on the features during contrastive training. Theoretically, the authors establish guarantees on the identifiability and downstream generalization of NCL, and the resulting representations are sparser and more disentangled than those of standard contrastive learning.
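A minimal sketch of the idea, assuming the constraint is imposed by passing encoder outputs through a ReLU before the contrastive loss is computed. The ReLU choice, the InfoNCE-style loss, and the toy "encoder outputs" are assumptions made for this example; the authors' actual implementation is in their linked repository.

```python
import math

def relu(v):
    # Clamp negative entries to zero so every feature is non-negative.
    return [max(0.0, a) for a in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v] if norm > 0 else list(v)

def info_nce(anchor, positive, negatives, temperature=0.5):
    a = normalize(anchor)
    logits = [dot(a, normalize(x)) / temperature for x in [positive] + list(negatives)]
    m = max(logits)
    return m + math.log(sum(math.exp(l - m) for l in logits)) - logits[0]

def zero_fraction(vectors):
    # Share of feature entries that are exactly zero (a simple sparsity measure).
    values = [a for v in vectors for a in v]
    return sum(1 for a in values if a == 0.0) / len(values)

# Hypothetical raw encoder outputs: anchor, positive, then two negatives.
raw = [[1.2, -0.7, 0.3], [0.9, -0.4, 0.5], [-0.6, 1.1, -0.2], [-0.3, 0.8, 0.4]]
non_negative = [relu(v) for v in raw]

# The same contrastive loss, now evaluated on the non-negative features.
ncl_loss = info_nce(non_negative[0], non_negative[1], non_negative[2:])
```

In this toy case the clamped features contain exact zeros where the raw features had negative entries, which hints at why non-negative features tend to be sparser.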

Experiments demonstrate that NCL outperforms standard contrastive learning on feature disentanglement, feature selection, and downstream classification. The authors also show that NCL can be extended to other learning scenarios and can benefit supervised learning as well.

Critical Analysis

The paper presents a well-motivated and technically sound approach to address the representation symmetry issue in contrastive learning. The non-negative and asymmetric representations learned by NCL can potentially be more interpretable and beneficial in certain applications.

However, the paper does not explore the limitations of NCL in depth. For example, the non-negativity constraint may limit the representation power of the model in some cases, and the authors do not discuss how to choose the appropriate level of non-negativity regularization.

Additionally, the paper focuses on standard contrastive learning approaches and does not compare NCL to other techniques that can also learn asymmetric representations, such as topic modeling or non-negative matrix factorization. A more comprehensive comparison would help better understand the strengths and weaknesses of NCL.

Overall, the paper presents a promising new approach, but further research is needed to fully evaluate the practical implications and understand the limitations of NCL.

Conclusion

This paper introduces a new contrastive learning framework called "Non-negative Contrastive Learning" (NCL) that learns non-negative and asymmetric representations. The key idea is to constrain the learned representations to be non-negative, which can provide advantages over standard contrastive learning approaches in terms of interpretability and performance on certain tasks.

The experimental results demonstrate the effectiveness of NCL, and the theoretical analysis provides insight into why non-negative representations can be beneficial. While the paper presents a well-motivated and technically sound approach, further research is needed to fully understand the limitations and broader applicability of NCL.

Overall, this work contributes a novel perspective to the growing field of contrastive learning and represents a step towards developing more interpretable and task-specific representation learning techniques.



Related Papers

Non-negative Contrastive Learning

Yifei Wang, Qi Zhang, Yaoyu Guo, Yisen Wang

Deep representations have shown promising performance when transferred to downstream tasks in a black-box manner. Yet, their inherent lack of interpretability remains a significant challenge, as these features are often opaque to human understanding. In this paper, we propose Non-negative Contrastive Learning (NCL), a renaissance of Non-negative Matrix Factorization (NMF) aimed at deriving interpretable features. The power of NCL lies in its enforcement of non-negativity constraints on features, reminiscent of NMF's capability to extract features that align closely with sample clusters. NCL not only aligns mathematically well with an NMF objective but also preserves NMF's interpretability attributes, resulting in a more sparse and disentangled representation compared to standard contrastive learning (CL). Theoretically, we establish guarantees on the identifiability and downstream generalization of NCL. Empirically, we show that these advantages enable NCL to outperform CL significantly on feature disentanglement, feature selection, as well as downstream classification tasks. At last, we show that NCL can be easily extended to other learning scenarios and benefit supervised learning as well. Code is available at https://github.com/PKU-ML/non_neg.

4/24/2024

Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning

Jihai Zhang, Xiang Lan, Xiaoye Qu, Yu Cheng, Mengling Feng, Bryan Hooi

Self-Supervised Contrastive Learning has proven effective in deriving high-quality representations from unlabeled data. However, a major challenge that hinders both unimodal and multimodal contrastive learning is feature suppression, a phenomenon where the trained model captures only a limited portion of the information from the input data while overlooking other potentially valuable content. This issue often leads to indistinguishable representations for visually similar but semantically different inputs, adversely affecting downstream task performance, particularly those requiring rigorous semantic comprehension. To address this challenge, we propose a novel model-agnostic Multistage Contrastive Learning (MCL) framework. Unlike standard contrastive learning which inherently captures one single biased feature distribution, MCL progressively learns previously unlearned features through feature-aware negative sampling at each stage, where the negative samples of an anchor are exclusively selected from the cluster it was assigned to in preceding stages. Meanwhile, MCL preserves the previously well-learned features by cross-stage representation integration, integrating features across all stages to form final representations. Our comprehensive evaluation demonstrates MCL's effectiveness and superiority across both unimodal and multimodal contrastive learning, spanning a range of model architectures from ResNet to Vision Transformers (ViT). Remarkably, in tasks where the original CLIP model has shown limitations, MCL dramatically enhances performance, with improvements up to threefold on specific attributes in the recently proposed MMVP benchmark.

7/16/2024

DN-CL: Deep Symbolic Regression against Noise via Contrastive Learning

Jingyi Liu, Yanjie Li, Lina Yu, Min Wu, Weijun Li, Wenqiang Li, Meilan Hao, Yusong Deng, Shu Wei

Noise ubiquitously exists in signals due to numerous factors including physical, electronic, and environmental effects. Traditional methods of symbolic regression, such as genetic programming or deep learning models, aim to find the most fitting expressions for these signals. However, these methods often overlook the noise present in real-world data, leading to reduced fitting accuracy. To tackle this issue, we propose Deep Symbolic Regression against Noise via Contrastive Learning (DN-CL). DN-CL employs two parameter-sharing encoders to embed data points from various data transformations into feature shields against noise. This model treats noisy data and clean data as different views of the ground-truth mathematical expressions. Distances between these features are minimized, utilizing contrastive learning to distinguish between 'positive' noise-corrected pairs and 'negative' contrasting pairs. Our experiments indicate that DN-CL demonstrates superior performance in handling both noisy and clean data, presenting a promising method of symbolic regression.

6/24/2024

Contrastive Factor Analysis

Zhibin Duan, Tiansheng Wen, Yifei Wang, Chen Zhu, Bo Chen, Mingyuan Zhou

Factor analysis, often regarded as a Bayesian variant of matrix factorization, offers superior capabilities in capturing uncertainty, modeling complex dependencies, and ensuring robustness. As the deep learning era arrives, factor analysis is receiving less and less attention due to their limited expressive ability. On the contrary, contrastive learning has emerged as a potent technique with demonstrated efficacy in unsupervised representational learning. While the two methods are different paradigms, recent theoretical analysis has revealed the mathematical equivalence between contrastive learning and matrix factorization, providing a potential possibility for factor analysis combined with contrastive learning. Motivated by the interconnectedness of contrastive learning, matrix factorization, and factor analysis, this paper introduces a novel Contrastive Factor Analysis framework, aiming to leverage factor analysis's advantageous properties within the realm of contrastive learning. To further leverage the interpretability properties of non-negative factor analysis, which can learn disentangled representations, contrastive factor analysis is extended to a non-negative version. Finally, extensive experimental validation showcases the efficacy of the proposed contrastive (non-negative) factor analysis methodology across multiple key properties, including expressiveness, robustness, interpretability, and accurate uncertainty estimation.

8/2/2024