On the Limitations of General Purpose Domain Generalisation Methods

2202.00563

Published 5/24/2024 by Henry Gouk, Ondrej Bohdal, Da Li, Timothy Hospedales

Abstract

We investigate the fundamental performance limitations of learning algorithms in several Domain Generalisation (DG) settings. Motivated by the difficulty that previously proposed methods have in reliably outperforming Empirical Risk Minimisation (ERM), we derive upper bounds on the excess risk of ERM, and lower bounds on the minimax excess risk. Our findings show that in all the DG settings we consider, it is not possible to significantly outperform ERM. Our conclusions are not limited to the standard covariate shift setting; they also hold in two other settings with additional restrictions on how domains can differ. The first constrains all domains to have a non-trivial bound on pairwise distances, as measured by a broad class of integral probability metrics. The second alternate setting considers a restricted class of DG problems where all domains have the same underlying support. Our analysis also suggests how different strategies can be used to optimise the performance of ERM in each of these DG settings. We also experimentally explore hypotheses suggested by our theoretical analysis.

Overview

The paper investigates the fundamental performance limitations of learning algorithms in several Domain Generalisation (DG) settings.
The authors derive upper bounds on the excess risk of Empirical Risk Minimisation (ERM) and lower bounds on the minimax excess risk.
Their findings show that in all the DG settings considered, it is not possible to significantly outperform ERM.
The analysis covers the standard covariate shift setting, as well as two other settings with additional restrictions on how domains can differ.

Plain English Explanation

The researchers looked at how well machine learning models can perform when they need to work with data from different situations or "domains." This is called domain generalisation (DG). They found that a simple technique called Empirical Risk Minimisation (ERM) is very hard to beat in these DG settings.

ERM is a straightforward way to train machine learning models - it just tries to minimise the error on the training data. The researchers showed that even though more complex DG methods have been proposed, they can't significantly outperform this simple ERM approach. This was true not only for the standard DG setting, but also for two other more restricted DG settings they considered.
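To make the idea of ERM concrete, here is a minimal sketch in Python with NumPy. It is not the paper's experimental setup: the synthetic domains, the helper names (`make_domain`, `erm_logistic`, `accuracy`) and all hyperparameters are assumptions chosen for illustration. Each domain shifts the input distribution while the labelling rule stays fixed, which is one simple instance of covariate shift. ERM pools the data from all training domains and minimises the average loss, with no domain-specific machinery.

```python
import numpy as np

def make_domain(rng, shift, n=200):
    """Hypothetical synthetic domain: inputs are shifted, the labelling rule is fixed."""
    x = rng.normal(loc=shift, scale=1.0, size=(n, 2))
    y = (x[:, 0] + x[:, 1] > 0).astype(float)
    return x, y

def erm_logistic(x, y, lr=0.1, steps=500):
    """ERM: gradient descent on the average logistic loss of the pooled sample."""
    xb = np.hstack([x, np.ones((len(x), 1))])  # append a bias feature
    w = np.zeros(xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-xb @ w))      # predicted probabilities
        w -= lr * xb.T @ (p - y) / len(y)      # gradient of the mean loss
    return w

def accuracy(w, x, y):
    xb = np.hstack([x, np.ones((len(x), 1))])
    return float(np.mean((xb @ w > 0) == (y > 0.5)))

rng = np.random.default_rng(0)

# Pool three training domains; ERM ignores which domain a point came from.
train_domains = [make_domain(rng, s) for s in (-1.0, 0.0, 1.0)]
x_train = np.vstack([x for x, _ in train_domains])
y_train = np.concatenate([y for _, y in train_domains])

w = erm_logistic(x_train, y_train)

# Evaluate on a domain that was not seen during training.
x_test, y_test = make_domain(rng, shift=0.5)
train_acc = accuracy(w, x_train, y_train)
test_acc = accuracy(w, x_test, y_test)
print(f"train accuracy: {train_acc:.3f}, unseen-domain accuracy: {test_acc:.3f}")
```

A specialised DG method would add something on top of this loop, such as a penalty that depends on domain labels. The paper's claim is that, in the settings it analyses, such additions cannot improve much on the plain pooled objective.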

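In one of these, all the domains had to be reasonably close to one another, with closeness measured by a broad family of distances between probability distributions (integral probability metrics). In the other, all the domains had to share the same underlying support, meaning the same set of possible inputs can occur in every domain even if some inputs are more likely in one domain than in another. Even with these extra constraints, the researchers found it's very difficult to do better than ERM.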

Their analysis also suggests ways to optimise the performance of ERM in each of these DG settings. And they did some experiments to test the ideas from their theoretical analysis.

Technical Explanation

The paper investigates the fundamental performance limitations of learning algorithms in several Domain Generalisation (DG) settings. Motivated by the difficulty that previously proposed DG methods have in reliably outperforming Empirical Risk Minimisation (ERM), the authors derive upper bounds on the excess risk of ERM, and lower bounds on the minimax excess risk.
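For readers unfamiliar with the two quantities being bounded, the following block states them in one common form. The notation is an assumption made here for illustration and may differ from the paper's: $\mathcal{E}$ is an environment (a distribution over domains $P$), $\mathcal{H}$ is the hypothesis class, $\ell$ is the loss, $S$ is a training sample of $m$ points from each of $n$ domains, and $A$ ranges over learning algorithms.

```latex
% Risk of a hypothesis h on unseen domains drawn from the environment
L_{\mathcal{E}}(h) = \mathbb{E}_{P \sim \mathcal{E}} \, \mathbb{E}_{(x, y) \sim P} \left[ \ell(h(x), y) \right]

% ERM on the pooled sample from n domains with m points each
\hat{h}_{\mathrm{ERM}} \in \operatorname*{arg\,min}_{h \in \mathcal{H}}
  \frac{1}{nm} \sum_{j=1}^{n} \sum_{i=1}^{m} \ell\big(h(x_{ij}), y_{ij}\big)

% Excess risk of a hypothesis h relative to the best hypothesis in the class
L_{\mathcal{E}}(h) - \inf_{h^{*} \in \mathcal{H}} L_{\mathcal{E}}(h^{*})

% Minimax excess risk over a family \mathfrak{E} of environments
\inf_{A} \sup_{\mathcal{E} \in \mathfrak{E}}
  \mathbb{E}_{S} \left[ L_{\mathcal{E}}(A(S)) - \inf_{h^{*} \in \mathcal{H}} L_{\mathcal{E}}(h^{*}) \right]
```

Read this way, an upper bound on the excess risk of ERM says how well ERM is guaranteed to do, while a lower bound on the minimax excess risk says how well any algorithm can do on its worst-case problem. When the two are of comparable order, no method can improve substantially on ERM in that worst-case sense. The specific rates and constants are given in the paper and are not reproduced here.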

Their findings show that in all the DG settings considered, it is not possible to significantly outperform ERM. This includes the standard covariate shift setting, as well as two other settings with additional restrictions:

A setting where all domains have a non-trivial bound on pairwise distances, as measured by a broad class of integral probability metrics.
A setting where all domains have the same underlying support.
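For intuition about the first restricted setting: an integral probability metric (IPM) measures the distance between distributions $P$ and $Q$ as $\sup_{f \in \mathcal{F}} |\mathbb{E}_P f - \mathbb{E}_Q f|$ for some function class $\mathcal{F}$. Taking $\mathcal{F}$ to be the 1-Lipschitz functions gives the 1-Wasserstein distance, used below as one example of an IPM. The sketch is an illustration only: the paper treats a broad class of IPMs, and the helper names, the one-dimensional data and the equal-sample-size restriction are assumptions made here.

```python
import numpy as np

def wasserstein1_1d(a, b):
    """Empirical 1-Wasserstein distance between two equal-size 1D samples.

    In one dimension with equal sample sizes, this is the mean absolute
    difference between the sorted samples.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(a - b)))

def max_pairwise_distance(domains):
    """Largest pairwise distance over a list of domain samples."""
    return max(
        wasserstein1_1d(domains[i], domains[j])
        for i in range(len(domains))
        for j in range(i + 1, len(domains))
    )

rng = np.random.default_rng(0)
base = rng.normal(size=1000)

# Three hypothetical domains that are pure translations of the same sample,
# so the distance between any two equals the difference of their shifts.
domains = [base + shift for shift in (0.0, 0.3, 0.5)]
radius = max_pairwise_distance(domains)
print(f"largest pairwise distance: {radius:.3f}")
```

A bound on `radius` plays the role of the "non-trivial bound on pairwise distances": it caps how far apart any two domains can be. In this toy example the domains also share the same support (the whole real line), which is the kind of condition the second restricted setting imposes.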

The paper's analysis also suggests strategies to optimise the performance of ERM in each of these DG settings. The authors also experimentally explore hypotheses suggested by their theoretical analysis.

Critical Analysis

The paper provides a rigorous theoretical analysis of the fundamental limitations of learning algorithms in several DG settings. By deriving upper and lower bounds, the authors demonstrate that outperforming ERM is extremely challenging, even with additional restrictions on the problem.

One potential limitation is that the analysis may not capture all the nuances of real-world DG problems, which can be highly complex and diverse. The conclusions hold for the specific settings the paper analyses, and more research is needed to understand how far they extend beyond them.

Additionally, the paper focuses on excess risk as the primary performance metric. While this is a common and important measure, there may be other relevant criteria, such as robustness or sample efficiency, that are not explored in depth.

Nevertheless, the paper makes a significant contribution to the understanding of DG and the capabilities of different learning algorithms. The insights and techniques developed here could inform the design of more effective DG methods in the future.

Conclusion

This paper provides important insights into the fundamental limitations of learning algorithms in Domain Generalisation (DG) settings. The authors show that it is extremely difficult to significantly outperform the simple Empirical Risk Minimisation (ERM) approach, even when additional restrictions are placed on the problem.

These findings challenge the notion that more complex DG methods are necessary to achieve good performance. Instead, the paper suggests that optimising ERM may be a more promising avenue for improving DG capabilities. The theoretical analysis and experimental results offer a deeper understanding of DG and could guide future research in this area.

Overall, this work highlights the importance of understanding the inherent limitations of machine learning algorithms, especially when applied to challenging real-world problems like domain generalisation. By acknowledging these limitations, researchers can develop more realistic expectations and focus their efforts on the most promising directions for advancement.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Domain Generalisation via Imprecise Learning

Anurag Singh, Siu Lun Chau, Shahine Bouabid, Krikamol Muandet

Out-of-distribution (OOD) generalisation is challenging because it involves not only learning from empirical data, but also deciding among various notions of generalisation, e.g., optimising the average-case risk, worst-case risk, or interpolations thereof. While this choice should in principle be made by the model operator like medical doctors, this information might not always be available at training time. The institutional separation between machine learners and model operators leads to arbitrary commitments to specific generalisation strategies by machine learners due to these deployment uncertainties. We introduce the Imprecise Domain Generalisation framework to mitigate this, featuring an imprecise risk optimisation that allows learners to stay imprecise by optimising against a continuous spectrum of generalisation strategies during training, and a model framework that allows operators to specify their generalisation preference at deployment. Supported by both theoretical and empirical evidence, our work showcases the benefits of integrating imprecision into domain generalisation.

5/31/2024

cs.LG

Towards a Better Evaluation of Out-of-Domain Generalization

Duhun Hwang, Suhyun Kang, Moonjung Eo, Jimyeong Kim, Wonjong Rhee

The objective of Domain Generalization (DG) is to devise algorithms and models capable of achieving high performance on previously unseen test distributions. In the pursuit of this objective, the average measure has been employed as the prevalent measure for evaluating models and comparing algorithms in the existing DG studies. Despite its significance, a comprehensive exploration of the average measure has been lacking and its suitability in approximating the true domain generalization performance has been questionable. In this study, we carefully investigate the limitations inherent in the average measure and propose the worst+gap measure as a robust alternative. We establish theoretical grounds of the proposed measure by deriving two theorems starting from two different assumptions. We conduct extensive experimental investigations to compare the proposed worst+gap measure with the conventional average measure. Given the indispensable need to access the true DG performance for studying measures, we modify five existing datasets to come up with SR-CMNIST, C-Cats&Dogs, L-CIFAR10, PACS-corrupted, and VLCS-corrupted datasets. The experiment results unveil an inferior performance of the average measure in approximating the true DG performance and confirm the robustness of the theoretically supported worst+gap measure.

6/4/2024

cs.LG cs.CV stat.ML

Domain Agnostic Conditional Invariant Predictions for Domain Generalization

Zongbin Wang, Bin Pan, Zhenwei Shi

Domain generalization aims to develop a model that can perform well on unseen target domains by learning from multiple source domains. However, recently proposed domain generalization models usually rely on domain labels, which may not be available in many real-world scenarios. To address this challenge, we propose a Discriminant Risk Minimization (DRM) theory and the corresponding algorithm to capture the invariant features without domain labels. In DRM theory, we prove that reducing the discrepancy of prediction distribution between the overall source domain and any subset of it can contribute to obtaining invariant features. To apply the DRM theory, we develop an algorithm which is composed of Bayesian inference and a new penalty termed Categorical Discriminant Risk (CDR). In Bayesian inference, we transform the output of the model into a probability distribution to align with our theoretical assumptions. We adopt a sliding update approach to approximate the overall prediction distribution of the model, which enables us to obtain the CDR penalty. We also indicate the effectiveness of these components in finding invariant features. We evaluate our algorithm against various domain generalization methods on multiple real-world datasets, providing empirical support for our theory.

6/11/2024

cs.LG

The Power of Sampling: Dimension-free Risk Bounds in Private ERM

Yin Tat Lee, Daogao Liu, Zhou Lu

Differentially private empirical risk minimization (DP-ERM) is a fundamental problem in private optimization. While the theory of DP-ERM is well-studied, as large-scale models become prevalent, traditional DP-ERM methods face new challenges, including (1) the prohibitive dependence on the ambient dimension, (2) the highly non-smooth objective functions, (3) costly first-order gradient oracles. Such challenges demand rethinking existing DP-ERM methodologies. In this work, we show that the regularized exponential mechanism combined with existing samplers can address these challenges altogether: under the standard unconstrained domain and low-rank gradients assumptions, our algorithm can achieve rank-dependent risk bounds for non-smooth convex objectives using only zeroth order oracles, which was not accomplished by prior methods. This highlights the power of sampling in differential privacy. We further construct lower bounds, demonstrating that when gradients are full-rank, there is no separation between the constrained and unconstrained settings. Our lower bound is derived from a general black-box reduction from unconstrained to the constrained domain and an improved lower bound in the constrained setting, which might be of independent interest.

6/5/2024

cs.LG cs.CR