Mitigating Spurious Correlations for Self-supervised Recommendation

Read original: arXiv:2212.04282 - Published 4/19/2024 by Xinyu Lin, Yiyan Xu, Wenjie Wang, Yang Zhang, Fuli Feng

Overview

Self-supervised learning (SSL) has been very successful in recommendation systems, but SSL models can suffer from spurious correlations that lead to poor generalization.
Existing methods to mitigate spurious correlations either rely on ID-based SSL, which sacrifices the benefits of invariant features, or on feature engineering, which requires expensive human labeling.
This paper proposes a framework to automatically identify and remove spurious features, and block their negative impact during SSL.

Plain English Explanation

Recommendation systems, which suggest products or content that users might like, have seen great improvements thanks to a technique called self-supervised learning (SSL). SSL recommender models can learn useful patterns from data without needing a lot of manual labeling.

However, these SSL models can sometimes latch onto "spurious correlations" - patterns in the data that seem predictive but don't actually reflect the true underlying relationships. This can cause the models to generalize poorly and make bad recommendations, especially when the data changes over time or across different environments.
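To make the idea of a spurious correlation concrete, here is a small toy example in Python. It is not taken from the paper: the two environments, the click labels, and the feature names `matches_taste` and `flashy_title` are all made up for illustration. It only shows how a feature can look predictive in one environment and carry no signal in another.

```python
import numpy as np


def feature_label_correlation(feature, label):
    """Pearson correlation between one feature column and the click label."""
    return float(np.corrcoef(feature, label)[0, 1])


# Hypothetical toy data: 1 = clicked, 0 = not clicked.
clicks_env_a = np.array([1, 1, 1, 0, 0, 0, 1, 0])
clicks_env_b = np.array([1, 0, 1, 0, 1, 0, 1, 0])

# "matches_taste" tracks clicks in both environments (an invariant feature).
taste_env_a = clicks_env_a.copy()
taste_env_b = clicks_env_b.copy()

# "flashy_title" tracks clicks in environment A only (a spurious feature).
flashy_env_a = np.array([1, 1, 1, 0, 0, 0, 1, 1])
flashy_env_b = np.array([1, 1, 0, 0, 0, 0, 1, 1])

corr_taste_a = feature_label_correlation(taste_env_a, clicks_env_a)
corr_taste_b = feature_label_correlation(taste_env_b, clicks_env_b)
corr_flashy_a = feature_label_correlation(flashy_env_a, clicks_env_a)
corr_flashy_b = feature_label_correlation(flashy_env_b, clicks_env_b)

print("matches_taste:", corr_taste_a, corr_taste_b)
print("flashy_title: ", corr_flashy_a, corr_flashy_b)
```

In this toy data the made-up `flashy_title` feature correlates strongly with clicks in environment A and not at all in environment B, so a model trained only on A would be tempted to rely on it and would then generalize poorly to B.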

Existing approaches to fixing this issue either discard user and item features altogether and rely only on IDs (which sacrifices the genuinely useful, invariant features) or use feature engineering to identify spurious features by hand (which requires costly human labeling).

This paper proposes a new framework that can automatically identify and block the influence of spurious features, without losing the benefits of the genuinely informative features. The key ideas are:

Divide the user-item interaction data into multiple "environments" with distribution shifts, to expose spurious correlations.
Learn a "feature mask" that can selectively block the influence of spurious features, while preserving the impact of invariant features.
Use this mask to remove spurious features from the model's inputs, and also to guide "feature augmentation" during self-supervised training, further reducing the model's reliance on spurious signals.
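The masking step in the list above can be pictured with a short sketch. This is an assumption-laden illustration, not the paper's implementation: the function name `apply_feature_mask`, the sigmoid parameterization, and the 0.5 threshold are all hypothetical choices. It assumes each feature field has one embedding vector and one learned mask logit.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def apply_feature_mask(feature_embeddings, mask_logits, threshold=0.5):
    """Zero out the embeddings of features that the mask flags as spurious.

    feature_embeddings: array of shape (num_features, dim)
    mask_logits: array of shape (num_features,), one learned score per feature
    (hypothetical parameterization, assumed for this sketch)
    """
    keep_prob = sigmoid(mask_logits)
    # Hard 0/1 mask: 1 keeps the feature, 0 blocks it.
    hard_mask = (keep_prob >= threshold).astype(feature_embeddings.dtype)
    return feature_embeddings * hard_mask[:, None], hard_mask


# Three made-up feature fields with 2-dimensional embeddings.
embeddings = np.ones((3, 2))
logits = np.array([4.0, -4.0, 0.0])  # the second feature is treated as spurious

masked_embeddings, keep_mask = apply_feature_mask(embeddings, logits)
print(keep_mask)
print(masked_embeddings)
```

A hard mask is used here only because it is easy to read; a trainable version would more plausibly use a soft or relaxed mask so that gradients can flow through it.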

Experiments show this framework can effectively mitigate spurious correlations and improve the generalization of SSL recommendation models.

Technical Explanation

The proposed framework, called Invariant Feature Learning (IFL), addresses the spurious correlation problem in SSL recommender models in two key ways:

Automatic Spurious Feature Masking: IFL first divides the user-item interaction data into multiple "environments" with distribution shifts. This exposes features that are spuriously correlated with the target in one environment but not others. IFL then learns a "feature mask" that can selectively block the influence of these spurious features, while preserving the impact of features that are invariant across environments.
Mask-Guided Feature Augmentation: Beyond just removing spurious features, IFL uses the learned feature mask to guide "feature augmentation" during the self-supervised training process. This blocks the negative effect of spurious features from being transmitted to other features during SSL, further reducing the model's reliance on spurious signals.
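One way to picture mask-guided augmentation is sketched below. The summary does not spell out the paper's exact augmentation operator, so this substitutes a simple, commonly used one (random feature dropout) and restricts it with the mask; the function name `mask_guided_views`, the `drop_rate` parameter, and the two-view setup are assumptions made for illustration.

```python
import numpy as np


def mask_guided_views(feature_embeddings, keep_mask, drop_rate=0.2, seed=0):
    """Build two augmented views that never contain masked (spurious) features.

    feature_embeddings: array of shape (num_features, dim)
    keep_mask: 0/1 array of shape (num_features,), 1 = invariant, 0 = spurious
    drop_rate: chance of additionally dropping each kept feature in a view
    (assumed augmentation scheme, not confirmed by the source)
    """
    rng = np.random.default_rng(seed)
    views = []
    for _ in range(2):
        # Random dropout is applied on top of the mask, so spurious features
        # stay blocked in every view and only invariant features vary.
        random_keep = rng.random(keep_mask.shape) >= drop_rate
        view_mask = keep_mask * random_keep
        views.append(feature_embeddings * view_mask[:, None])
    return views


embeddings = np.arange(8, dtype=float).reshape(4, 2) + 1.0
keep_mask = np.array([1.0, 0.0, 1.0, 1.0])  # second feature flagged as spurious

view_1, view_2 = mask_guided_views(embeddings, keep_mask, drop_rate=0.3)
print(view_1)
print(view_2)
```

In a contrastive SSL setup the two views would then be encoded and pulled together by the training loss; because the flagged feature is absent from both, the learned representation has no opportunity to align on it.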

The IFL framework is evaluated in extensive experiments on two datasets, where the authors report that it mitigates spurious correlations and improves the generalization of SSL recommender models.

Critical Analysis

The IFL framework is a promising approach to addressing the spurious correlation problem in SSL recommender models. By automatically identifying and blocking spurious features, it can help these models generalize better to new data distributions without requiring expensive human labeling.

However, the paper does not explore the impact of different ways of constructing the "environments" used to expose spurious correlations. The choice of environment definition could significantly affect the performance of the IFL framework, and further research is needed to understand the best practices.
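To illustrate why the environment definition matters, the sketch below contrasts two hypothetical ways of assigning interactions to environments. Neither is claimed to be the paper's procedure; the function names `split_by_time` and `split_randomly` and the toy timestamps are assumptions introduced here.

```python
import numpy as np


def split_by_time(timestamps, num_envs):
    """Assign interactions to environments by timestamp rank (oldest first)."""
    timestamps = np.asarray(timestamps)
    order = np.argsort(timestamps, kind="stable")
    env_ids = np.empty(len(timestamps), dtype=int)
    # Equal-sized, time-ordered buckets: rank r goes to bucket r * K // N.
    env_ids[order] = np.arange(len(timestamps)) * num_envs // len(timestamps)
    return env_ids


def split_randomly(num_interactions, num_envs, seed=0):
    """Assign interactions to environments uniformly at random."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, num_envs, size=num_interactions)


# Made-up interaction timestamps.
timestamps = [50, 10, 40, 20, 30, 60]

time_envs = split_by_time(timestamps, num_envs=3)
random_envs = split_randomly(len(timestamps), num_envs=3)
print(time_envs)
print(random_envs)
```

A time-based split can expose shifts that develop as user behavior changes, whereas a purely random split tends to produce environments with near-identical distributions and so is unlikely to reveal which features are unstable. Which definition works best for a given dataset is the open question raised above.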

Additionally, the paper focuses on recommender systems, but the core ideas of IFL could potentially be applied to other domains with self-supervised learning. Further research is needed to explore the generalizability of the approach to other types of SSL models and applications.

Overall, the IFL framework represents an important step forward in building more robust and generalizable self-supervised learning systems. By addressing the spurious correlation issue, it can help unlock the full potential of SSL in real-world applications.

Conclusion

This paper proposes an Invariant Feature Learning (IFL) framework to mitigate the impact of spurious correlations in self-supervised learning (SSL) recommender models. IFL automatically identifies and blocks the influence of spurious features, while preserving the benefits of invariant features. Experiments show IFL can effectively improve the generalization of SSL recommender models compared to existing approaches.

The key contributions of IFL are its ability to automatically mask spurious features without supervision, and its use of the feature mask to guide self-supervised training and further reduce reliance on spurious signals. These advancements represent an important step towards building more robust and reliable SSL-based recommendation systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Mitigating Spurious Correlations for Self-supervised Recommendation

Xinyu Lin, Yiyan Xu, Wenjie Wang, Yang Zhang, Fuli Feng

Recent years have witnessed the great success of self-supervised learning (SSL) in recommendation systems. However, SSL recommender models are likely to suffer from spurious correlations, leading to poor generalization. To mitigate spurious correlations, existing work usually pursues ID-based SSL recommendation or utilizes feature engineering to identify spurious features. Nevertheless, ID-based SSL approaches sacrifice the positive impact of invariant features, while feature engineering methods require high-cost human labeling. To address the problems, we aim to automatically mitigate the effect of spurious correlations. This objective requires to 1) automatically mask spurious features without supervision, and 2) block the negative effect transmission from spurious features to other features during SSL. To handle the two challenges, we propose an invariant feature learning framework, which first divides user-item interactions into multiple environments with distribution shifts and then learns a feature mask mechanism to capture invariant features across environments. Based on the mask mechanism, we can remove the spurious features for robust predictions and block the negative effect transmission via mask-guided feature augmentation. Extensive experiments on two datasets demonstrate the effectiveness of the proposed framework in mitigating spurious correlations and improving the generalization abilities of SSL models. The code is available at https://github.com/Linxyhaha/IFL.

4/19/2024

Views Can Be Deceiving: Improved SSL Through Feature Space Augmentation

Kimia Hamidieh, Haoran Zhang, Swami Sankaranarayanan, Marzyeh Ghassemi

Supervised learning methods have been found to exhibit inductive biases favoring simpler features. When such features are spuriously correlated with the label, this can result in suboptimal performance on minority subgroups. Despite the growing popularity of methods which learn from unlabeled data, the extent to which these representations rely on spurious features for prediction is unclear. In this work, we explore the impact of spurious features on Self-Supervised Learning (SSL) for visual representation learning. We first empirically show that commonly used augmentations in SSL can cause undesired invariances in the image space, and illustrate this with a simple example. We further show that classical approaches in combating spurious correlations, such as dataset re-sampling during SSL, do not consistently lead to invariant representations. Motivated by these findings, we propose LateTVG to remove spurious information from these representations during pre-training, by regularizing later layers of the encoder via pruning. We find that our method produces representations which outperform the baselines on several benchmarks, without the need for group or label information during SSL.

6/28/2024

Learning Robust Classifiers with Self-Guided Spurious Correlation Mitigation

Guangtao Zheng, Wenqian Ye, Aidong Zhang

Deep neural classifiers tend to rely on spurious correlations between spurious attributes of inputs and targets to make predictions, which could jeopardize their generalization capability. Training classifiers robust to spurious correlations typically relies on annotations of spurious correlations in data, which are often expensive to get. In this paper, we tackle an annotation-free setting and propose a self-guided spurious correlation mitigation framework. Our framework automatically constructs fine-grained training labels tailored for a classifier obtained with empirical risk minimization to improve its robustness against spurious correlations. The fine-grained training labels are formulated with different prediction behaviors of the classifier identified in a novel spuriousness embedding space. We construct the space with automatically detected conceptual attributes and a novel spuriousness metric which measures how likely a class-attribute correlation is exploited for predictions. We demonstrate that training the classifier to distinguish different prediction behaviors reduces its reliance on spurious correlations without knowing them a priori and outperforms prior methods on five real-world datasets.

5/7/2024

Out of spuriousity: Improving robustness to spurious correlations without group annotations

Phuong Quynh Le, Jorg Schlotterer, Christin Seifert

Machine learning models are known to learn spurious correlations, i.e., features having strong relations with class labels but no causal relation. Relying on those correlations leads to poor performance in the data groups without these correlations and poor generalization ability. To improve the robustness of machine learning models to spurious correlations, we propose an approach to extract a subnetwork from a fully trained network that does not rely on spurious correlations. The subnetwork is found by the assumption that data points with the same spurious attribute will be close to each other in the representation space when training with ERM, then we employ supervised contrastive loss in a novel way to force models to unlearn the spurious connections. The increase in the worst-group performance of our approach contributes to strengthening the hypothesis that there exists a subnetwork in a fully trained dense network that is responsible for using only invariant features in classification tasks, therefore erasing the influence of spurious features even in the setup of multi spurious attributes and no prior knowledge of attributes labels.

7/23/2024