What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement

Read original: arXiv:2402.01865 - Published 6/21/2024 by Xisen Jin, Xiang Ren

Overview

This paper explores the challenge of "forgetting" in language model refinement, where fine-tuning a pre-trained model on new data can cause it to lose knowledge of previous examples.
The researchers propose a method to forecast which examples a language model is likely to forget when fine-tuned on new data, which could help mitigate this issue.
The paper presents an experimental study demonstrating the effectiveness of this forecasting approach and its potential applications in continual learning and preventing catastrophic forgetting.

Plain English Explanation

When training AI language models, there is a common problem called "forgetting." This happens when the model is fine-tuned on new data, causing it to lose knowledge of previous examples it was trained on. The researchers in this paper have come up with a way to predict which examples the model is likely to forget during this fine-tuning process.

Imagine you're teaching someone new skills, but in the process, they start to forget things they previously knew. The researchers' method is like being able to forecast which old skills the person is most likely to forget, so you can take steps to help them remember those important things.

By being able to forecast forgetting, the researchers believe this could lead to better ways of continuously updating and improving language models without them losing critical knowledge. This could have applications in areas like continual learning, where models need to adapt to new information over time without catastrophically forgetting what they already know.

Technical Explanation

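The key contribution of this paper is a method for forecasting which upstream examples a language model will forget after it is updated on new data. The researchers train forecasting models on a collection of online learned examples paired with the upstream examples that were forgotten as a result. They describe two forecasters: a partially interpretable one, built on the observation that changes in the pre-softmax logit scores of upstream examples resemble those of the online learned examples, and a black-box classifier based on inner products of example representations.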
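The paper's abstract mentions a black-box forecaster based on inner products of example representations. The sketch below shows one way such a score could be computed; it is not the authors' implementation. The function names, the `scale` and `bias` parameters, the sigmoid link, and the 0.5 threshold are all assumptions made for illustration, and the representations are plain lists of floats rather than real model outputs.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def forecast_forgetting(online_repr, upstream_reprs,
                        scale=1.0, bias=0.0, threshold=0.5):
    """Score each upstream example against one online learned example.

    Hypothetical scoring rule: sigmoid(scale * <h_online, h_upstream> + bias).
    In a trained forecaster, scale and bias (and the encoder producing the
    representations) would be learned; here they are fixed assumptions.
    Returns a list of (probability, predicted_forgotten) pairs.
    """
    results = []
    for h in upstream_reprs:
        logit = scale * dot(online_repr, h) + bias
        prob = 1.0 / (1.0 + math.exp(-logit))
        results.append((prob, prob >= threshold))
    return results


if __name__ == "__main__":
    online = [1.0, 0.0]
    upstream = [[2.0, 0.0], [-2.0, 0.0], [0.0, 5.0]]
    for prob, forgotten in forecast_forgetting(online, upstream):
        print(round(prob, 3), forgotten)
```

Upstream examples whose representations align with the online example receive a higher forecast probability, which is the intuition behind using inner products as the input to a classifier.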

They test this forecasting approach through a series of experiments in which they update language models on new data and then measure how well each forecasting model predicts which upstream examples the updated model will forget. The logit-based forecaster performs decently on BART but fails on T5 models, while the black-box classifier based on representation inner products achieves better forecasting performance across a series of setups.
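One common way to score a forecast of this kind is to compare the set of examples predicted as forgotten with the set actually forgotten after the update, using precision, recall, and F1. The sketch below assumes that evaluation setup; the summary does not state which metric the authors report, and the function name and example IDs are hypothetical.

```python
def forecast_scores(predicted_forgotten, actually_forgotten):
    """Precision, recall, and F1 of a forgetting forecast.

    Both arguments are iterables of example IDs (hypothetical identifiers).
    """
    predicted = set(predicted_forgotten)
    actual = set(actually_forgotten)
    true_positives = len(predicted & actual)

    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Forecaster flags examples 1-4; the updated model actually forgot 2, 4, 6.
    print(forecast_scores([1, 2, 3, 4], [2, 4, 6]))
```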

The researchers also demonstrate how this forgetting forecast could be used to mitigate catastrophic forgetting in continual learning scenarios, by selectively rehearsing or protecting the most vulnerable examples. This builds on prior work in adaptive memory replay and controlling forgetting during fine-tuning.
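Replaying the examples forecast to be forgotten can be sketched as a simple selection step before each model update. This is a minimal illustration under stated assumptions, not the authors' procedure: the top-k selection rule, the function names, and the dictionary-based data layout are all hypothetical.

```python
def select_replay(forecast_probs, k):
    """Return the IDs of the k upstream examples most likely to be forgotten.

    forecast_probs maps example ID -> forecast probability of forgetting.
    """
    ranked = sorted(forecast_probs.items(), key=lambda kv: kv[1], reverse=True)
    return [example_id for example_id, _ in ranked[:k]]


def build_update_batch(new_examples, upstream_pool, forecast_probs, k):
    """Mix the new (error-correcting) examples with targeted replay examples."""
    replay_ids = select_replay(forecast_probs, k)
    return list(new_examples) + [upstream_pool[i] for i in replay_ids]


if __name__ == "__main__":
    probs = {"a": 0.1, "b": 0.9, "c": 0.5}
    pool = {"a": "upstream A", "b": "upstream B", "c": "upstream C"}
    print(build_update_batch(["new fix"], pool, probs, k=2))
```

Compared with random replay, this concentrates the replay budget on the examples the forecaster considers most at risk, which is the controllability benefit the paper argues for.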

Critical Analysis

One limitation of this work is that the forecasting model relies on having access to detailed information about the model's performance on individual training examples, which may not always be available in real-world settings. Additionally, the experiments are conducted on relatively small language models and datasets, so further research is needed to understand how well the forecasting approach scales to larger, more complex models.

The paper also does not deeply explore the fundamental reasons why certain examples are more vulnerable to forgetting than others during fine-tuning. A better understanding of the underlying mechanisms driving forgetting could lead to more principled approaches for mitigating it.

Overall, this paper presents a promising step towards better managing the forgetting problem in language model refinement. By being able to forecast which examples are at risk of being forgotten, researchers and practitioners may be able to develop more effective strategies for continual learning and preventing catastrophic forgetting in AI systems.

Conclusion

This paper tackles the important challenge of "forgetting" in language model refinement, where fine-tuning on new data can cause a model to lose knowledge of previous examples. The researchers propose a method to forecast which examples a model is likely to forget, which could enable more effective strategies for mitigating this issue.

The experimental results demonstrate the effectiveness of this forecasting approach, and the researchers suggest potential applications in continual learning and preventing catastrophic forgetting. While there are some limitations to address, this work represents an important step towards more robust and adaptable language models that can continuously learn and improve without losing critical knowledge.

Related Papers

What Will My Model Forget? Forecasting Forgotten Examples in Language Model Refinement

Xisen Jin, Xiang Ren

Language models deployed in the wild make errors. However, simply updating the model with the corrected error instances causes catastrophic forgetting -- the updated model makes errors on instances learned during the instruction tuning or upstream training phase. Randomly replaying upstream data yields unsatisfactory performance and often comes with high variance and poor controllability. To this end, we try to forecast upstream examples that will be forgotten due to a model update for improved controllability of the replay process and interpretability. We train forecasting models given a collection of online learned examples and corresponding forgotten upstream pre-training examples. We propose a partially interpretable forecasting model based on the observation that changes in pre-softmax logit scores of pretraining examples resemble that of online learned examples, which performs decently on BART but fails on T5 models. We further show a black-box classifier based on inner products of example representations achieves better forecasting performance over a series of setups. Finally, we show that we reduce forgetting of upstream pretraining examples by replaying examples that are forecasted to be forgotten, demonstrating the practical utility of forecasting example forgetting.

6/21/2024

Demystifying Forgetting in Language Model Fine-Tuning with Statistical Analysis of Example Associations

Xisen Jin, Xiang Ren

Language models (LMs) are known to suffer from forgetting of previously learned examples when fine-tuned, breaking stability of deployed LM systems. Despite efforts on mitigating forgetting, few have investigated whether, and how forgotten upstream examples are associated with newly learned tasks. Insights on such associations enable efficient and targeted mitigation of forgetting. In this paper, we empirically analyze forgetting that occurs in $N$ upstream examples while the model learns $M$ new tasks and visualize their associations with an $M \times N$ matrix. We empirically demonstrate that the degree of forgetting can often be approximated by simple multiplicative contributions of the upstream examples and newly learned tasks. We also reveal more complicated patterns where specific subsets of examples are forgotten with statistics and visualization. Following our analysis, we predict forgetting that happens on upstream examples when learning a new task with matrix completion over the empirical associations, outperforming prior approaches that rely on trainable LMs. Project website: https://inklab.usc.edu/lm-forgetting-prediction/

6/21/2024
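The multiplicative approximation mentioned in the abstract above can be read as a rank-1 model: forgetting of upstream example j while learning task i is roughly a_i * b_j. The sketch below fits such a model to a partially observed matrix with alternating least squares and uses it to fill in the missing entries. It is an illustration of the general idea only; the abstract does not specify the authors' matrix completion algorithm, and the function name, initialization, and iteration count are assumptions.

```python
def rank1_complete(Z, iters=100):
    """Fill missing entries of Z with a rank-1 fit a[i] * b[j].

    Z is a list of rows (tasks x upstream examples); None marks an
    unobserved entry. Alternating least squares over observed entries only.
    """
    m, n = len(Z), len(Z[0])
    a = [1.0] * m  # per-task contribution (assumed initialization)
    b = [1.0] * n  # per-example contribution (assumed initialization)
    for _ in range(iters):
        for i in range(m):
            obs = [j for j in range(n) if Z[i][j] is not None]
            den = sum(b[j] ** 2 for j in obs)
            if den > 0:
                a[i] = sum(Z[i][j] * b[j] for j in obs) / den
        for j in range(n):
            obs = [i for i in range(m) if Z[i][j] is not None]
            den = sum(a[i] ** 2 for i in obs)
            if den > 0:
                b[j] = sum(Z[i][j] * a[i] for i in obs) / den
    return [[a[i] * b[j] for j in range(n)] for i in range(m)]


if __name__ == "__main__":
    # Synthetic, exactly multiplicative data with one hidden entry (true value 6).
    Z = [[1.0, 2.0, 3.0],
         [2.0, 4.0, None]]
    completed = rank1_complete(Z)
    print(round(completed[1][2], 4))
```

On real forgetting statistics the fit would only be approximate, and the abstract notes that more complicated patterns exist beyond the multiplicative one.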

Unforgettable Generalization in Language Models

Eric Zhang, Leshem Choshen, Jacob Andreas

When language models (LMs) are trained to forget (or "unlearn") a skill, how precisely does their behavior change? We study the behavior of transformer LMs in which tasks have been forgotten via fine-tuning on randomized labels. Such LMs learn to generate near-random predictions for individual examples in the "training" set used for forgetting. Across tasks, however, LMs exhibit extreme variability in whether LM predictions change on examples outside the training set. In some tasks (like entailment classification), forgetting generalizes robustly, and causes models to produce uninformative predictions on new task instances; in other tasks (like physical commonsense reasoning and scientific question answering) forgetting affects only the training examples, and models continue to perform the "forgotten" task accurately even for examples very similar to those that appeared in the training set. Dataset difficulty is not predictive of whether a behavior can be forgotten; instead, generalization in forgetting is (weakly) predicted by the confidence of LMs' initial task predictions and the variability of LM representations of training data, with low confidence and low variability both associated with greater generalization. Perhaps most surprisingly, random-label forgetting appears to be somewhat insensitive to the contents of the training set: for example, models trained on science questions with random labels continue to answer other science questions accurately, but begin to produce random labels on entailment classification tasks. Finally, we show that even generalizable forgetting is shallow: linear probes trained on LMs' representations can still perform tasks reliably after forgetting. Our results highlight the difficulty and unpredictability of performing targeted skill removal from models via fine-tuning.

9/5/2024

Controlling Forgetting with Test-Time Data in Continual Learning

Vaibhav Singh, Rahaf Aljundi, Eugene Belilovsky

Foundational vision-language models have shown impressive performance on various downstream tasks. Yet, there is still a pressing need to update these models later as new tasks or domains become available. Ongoing Continual Learning (CL) research provides techniques to overcome catastrophic forgetting of previous information when new knowledge is acquired. To date, CL techniques focus only on the supervised training sessions. This results in significant forgetting yielding inferior performance to even the prior model zero shot performance. In this work, we argue that test-time data hold great information that can be leveraged in a self supervised manner to refresh the model's memory of previous learned tasks and hence greatly reduce forgetting at no extra labelling cost. We study how unsupervised data can be employed online to improve models' performance on prior tasks upon encountering representative samples. We propose a simple yet effective student-teacher model with gradient based sparse parameters updates and show significant performance improvements and reduction in forgetting, which could alleviate the role of an offline episodic memory/experience replay buffer.

6/21/2024