Multi-Objective Hyperparameter Optimization in Machine Learning -- An Overview

arXiv:2206.07438

Published 6/7/2024 by Florian Karl, Tobias Pielok, Julia Moosbauer, Florian Pfisterer, Stefan Coors, Martin Binder, Lennart Schneider, Janek Thomas, Jakob Richter, Michel Lang and 3 others

cs.LG stat.ML

Abstract

Hyperparameter optimization constitutes a large part of typical modern machine learning workflows. This arises from the fact that machine learning methods and corresponding preprocessing steps often only yield optimal performance when hyperparameters are properly tuned. But in many applications, we are not interested in optimizing ML pipelines solely for predictive accuracy; additional metrics or constraints must be considered when determining an optimal configuration, resulting in a multi-objective optimization problem. This is often neglected in practice, due to a lack of knowledge and readily available software implementations for multi-objective hyperparameter optimization. In this work, we introduce the reader to the basics of multi-objective hyperparameter optimization and motivate its usefulness in applied ML. Furthermore, we provide an extensive survey of existing optimization strategies, both from the domain of evolutionary algorithms and Bayesian optimization. We illustrate the utility of multi-objective optimization (MOO) in several specific ML applications, considering objectives such as operating conditions, prediction time, sparseness, fairness, interpretability and robustness.

Overview

Hyperparameter optimization is a crucial step in modern machine learning workflows, because models and their preprocessing steps often perform well only when their hyperparameters are properly tuned.
However, in many applications, predictive accuracy is not the sole objective. Other factors like operating conditions, prediction time, sparseness, fairness, interpretability, and robustness must also be considered.
This paper introduces the basics of multi-objective hyperparameter optimization and highlights its usefulness in applied machine learning.
The authors provide a comprehensive survey of existing optimization strategies, covering both evolutionary algorithms and Bayesian optimization.
They illustrate the utility of multi-objective optimization in several specific machine learning applications.

Plain English Explanation

Hyperparameters are the knobs and dials of a machine learning model: settings such as a learning rate or a tree depth that are chosen before training rather than learned from the data. Tuning them is a crucial step in developing machine learning systems, because the right settings can make a big difference in how well the model works.

However, in many real-world applications, we don't just care about the model's predictive accuracy. We might also want the model to be fast, interpretable, fair, or robust to changes in the data. These are all important factors that need to be balanced when tuning the hyperparameters.

This paper explains the basics of multi-objective optimization, which is a way to find the best hyperparameter settings when you have multiple, potentially conflicting, objectives. The authors review different optimization strategies that can be used, from evolutionary algorithms to Bayesian optimization. They also provide examples of how multi-objective optimization can be applied in various machine learning scenarios, such as optimizing for speed, sparseness, fairness, and robustness.

The key idea is that by considering multiple objectives, you can find hyperparameter settings that strike the right balance between different desirable properties of the machine learning system, rather than just focusing on accuracy alone. This can lead to more practical and useful models in many applications.
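To make the idea of "striking the right balance" concrete, here is a minimal sketch of how one might filter a set of already-evaluated configurations down to the non-dominated (Pareto-optimal) ones. The candidate values are made up for illustration and are not taken from the paper; both objectives are assumed to be minimized.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` in every objective
    and strictly better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (error rate, prediction time in ms) for five configurations.
candidates = [
    (0.10, 50.0),
    (0.12, 20.0),
    (0.11, 60.0),  # worse than (0.10, 50.0) in both objectives
    (0.20, 5.0),
    (0.12, 25.0),  # same error as (0.12, 20.0) but slower
]

front = pareto_front(candidates)
print(front)
```

Each remaining configuration represents a different trade-off: none of them can be made faster without becoming less accurate, or more accurate without becoming slower. Choosing among them is a decision for the practitioner, not the optimizer.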

Technical Explanation

The paper begins by highlighting the importance of hyperparameter optimization in modern machine learning workflows. The authors note that while this optimization process is typically focused on maximizing predictive accuracy, many real-world applications require considering additional objectives or constraints, leading to a multi-objective optimization problem.
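The multi-objective problem described above can be written down compactly. The notation below is a common textbook convention and is assumed here for illustration; it is not quoted from the paper. Let \Lambda denote the hyperparameter search space and c_1, ..., c_m the objectives to be minimized (for example, validation error and prediction time).

```latex
% Multi-objective hyperparameter optimization: minimize all m objectives jointly
\min_{\lambda \in \Lambda} \; c(\lambda) = \bigl(c_1(\lambda), \dots, c_m(\lambda)\bigr)

% Pareto dominance: \lambda is no worse everywhere and strictly better somewhere
\lambda \prec \lambda' \iff
  \bigl(\forall i \in \{1,\dots,m\}:\ c_i(\lambda) \le c_i(\lambda')\bigr)
  \ \text{and}\
  \bigl(\exists j \in \{1,\dots,m\}:\ c_j(\lambda) < c_j(\lambda')\bigr)

% Pareto set: configurations that no other configuration dominates
\mathcal{P} = \bigl\{\lambda \in \Lambda : \nexists\, \lambda' \in \Lambda \ \text{with}\ \lambda' \prec \lambda \bigr\}
```

Because the objectives usually conflict, there is generally no single minimizer; the solution is the set \mathcal{P} of trade-offs, and the image of \mathcal{P} under c is the Pareto front.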

To address this, the authors provide an extensive survey of existing optimization strategies from the domains of evolutionary algorithms and Bayesian optimization. They discuss the strengths and weaknesses of these approaches and how they can be adapted to handle multiple, potentially conflicting objectives.
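As a rough illustration of the evolutionary family of methods, the sketch below runs a deliberately simplified loop: mutate the current population, merge parents and children, and keep only the non-dominated configurations. It is not NSGA-II or any specific algorithm from the survey. The single hyperparameter, its bounds, and the two toy objectives are all hypothetical stand-ins for real training runs.

```python
import random
from typing import List, Sequence, Tuple

LOW, HIGH = -2.0, 4.0  # assumed bounds of a single hypothetical hyperparameter


def objectives(x: float) -> Tuple[float, float]:
    """Toy stand-in for two conflicting costs (both minimized)."""
    return (x ** 2, (x - 2.0) ** 2)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse in all objectives and strictly better in one."""
    return all(u <= v for u, v in zip(a, b)) and any(u < v for u, v in zip(a, b))


def non_dominated(population: List[float]) -> List[float]:
    """Return the configurations whose objective vectors are not dominated."""
    scored = [(x, objectives(x)) for x in population]
    return [x for x, f in scored if not any(dominates(g, f) for _, g in scored)]


def evolve(generations: int = 30, pop_size: int = 20,
           sigma: float = 0.3, seed: int = 0) -> List[float]:
    """Simplified evolutionary multi-objective search on the toy problem."""
    rng = random.Random(seed)
    population = [rng.uniform(LOW, HIGH) for _ in range(pop_size)]
    for _ in range(generations):
        # Gaussian mutation, clipped to the search-space bounds.
        children = [min(HIGH, max(LOW, x + rng.gauss(0.0, sigma)))
                    for x in population]
        survivors = sorted(non_dominated(population + children))
        if len(survivors) > pop_size:
            # Thin out evenly along the sorted front to cap the population.
            step = len(survivors) / pop_size
            survivors = [survivors[int(i * step)] for i in range(pop_size)]
        population = survivors
    return population


front = evolve()
print([round(x, 2) for x in front])
```

Production algorithms add ingredients this sketch omits, such as ranking dominated solutions, diversity-preserving selection, and crossover. Bayesian optimization replaces the random mutation step with a surrogate model that proposes promising configurations.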

The authors then illustrate the utility of multi-objective optimization in several specific machine learning applications, considering objectives such as operating conditions, prediction time, sparseness, fairness, interpretability, and robustness. These examples showcase how this approach can lead to more well-rounded and practical machine learning models, compared to focusing solely on predictive accuracy.

Critical Analysis

The paper provides a comprehensive overview of the topic of multi-objective hyperparameter optimization, highlighting its importance and the various optimization strategies that can be employed. The authors acknowledge that this area is often neglected in practice, which underscores the value of this work in raising awareness and providing guidance.

One potential limitation of the paper is that it does not delve deeply into the specific challenges and considerations that may arise when applying multi-objective optimization in different domains or when dealing with complex machine learning pipelines. Additional case studies or more detailed discussions on the practical implementation and potential pitfalls could further strengthen the paper's utility for practitioners.

Nevertheless, the authors successfully demonstrate the benefits of considering multiple objectives in hyperparameter optimization and provide a solid foundation for researchers and engineers to explore this topic further.

Conclusion

This paper highlights the importance of moving beyond single-objective hyperparameter optimization in machine learning, and instead considering multiple, potentially conflicting objectives. By surveying various optimization strategies and illustrating their application in specific use cases, the authors make a compelling case for the adoption of multi-objective optimization techniques in real-world machine learning workflows.

The insights presented in this work can help researchers and practitioners develop better-rounded and more practical machine learning models, optimized not just for predictive accuracy but also for desirable properties such as speed, interpretability, fairness, and robustness. As machine learning systems are integrated into more critical decision-making processes, this holistic approach to hyperparameter optimization will matter more and more.

Related Papers

Trajectory-Based Multi-Objective Hyperparameter Optimization for Model Retraining

Wenyu Wang, Zheyi Fan, Szu Hui Ng

Training machine learning models inherently involves a resource-intensive and noisy iterative learning procedure that allows epoch-wise monitoring of the model performance. However, in multi-objective hyperparameter optimization scenarios, the insights gained from the iterative learning procedure typically remain underutilized. We notice that tracking the model performance across multiple epochs under a hyperparameter setting creates a trajectory in the objective space and that trade-offs along the trajectories are often overlooked despite their potential to offer valuable insights to decision-making for model retraining. Therefore, in this study, we propose to enhance the multi-objective hyperparameter optimization problem by having training epochs as an additional decision variable to incorporate trajectory information. Correspondingly, we present a novel trajectory-based multi-objective Bayesian optimization algorithm characterized by two features: 1) an acquisition function that captures the improvement made by the predictive trajectory of any hyperparameter setting and 2) a multi-objective early stopping mechanism that determines when to terminate the trajectory to maximize epoch efficiency. Numerical experiments on diverse synthetic simulations and hyperparameter tuning benchmarks indicate that our algorithm outperforms the state-of-the-art multi-objective optimizers in both locating better trade-offs and tuning efficiency.

5/27/2024

cs.LG

Common pitfalls to avoid while using multiobjective optimization in machine learning

Junaid Akhter, Paul David Fahrmann, Konstantin Sonntag, Sebastian Peitz

Recently, there has been an increasing interest in exploring the application of multiobjective optimization (MOO) in machine learning (ML). The interest is driven by the numerous situations in real-life applications where multiple objectives need to be optimized simultaneously. A key aspect of MOO is the existence of a Pareto set, rather than a single optimal solution, which illustrates the inherent trade-offs between objectives. Despite its potential, there is a noticeable lack of satisfactory literature that could serve as an entry-level guide for ML practitioners who want to use MOO. Hence, our goal in this paper is to produce such a resource. We critically review previous studies, particularly those involving MOO in deep learning (using Physics-Informed Neural Networks (PINNs) as a guiding example), and identify misconceptions that highlight the need for a better grasp of MOO principles in ML. Using MOO of PINNs as a case study, we demonstrate the interplay between the data loss and the physics loss terms. We highlight the most common pitfalls one should avoid while using MOO techniques in ML. We begin by establishing the groundwork for MOO, focusing on well-known approaches such as the weighted sum (WS) method, alongside more complex techniques like the multiobjective gradient descent algorithm (MGDA). Additionally, we compare the results obtained from the WS and MGDA with one of the most common evolutionary algorithms, NSGA-II. We emphasize the importance of understanding the specific problem, the objective space, and the selected MOO method, while also noting that neglecting factors such as convergence can result in inaccurate outcomes and, consequently, a non-optimal solution. Our goal is to offer a clear and practical guide for ML practitioners to effectively apply MOO, particularly in the context of DL.

5/3/2024

cs.LG

Enhancing Multi-Objective Optimization through Machine Learning-Supported Multiphysics Simulation

Diego Botache, Jens Decke, Winfried Ripken, Abhinay Dornipati, Franz Gotz-Hahn, Mohamed Ayeb, Bernhard Sick

This paper presents a methodological framework for training, self-optimising, and self-organising surrogate models to approximate and speed up multiobjective optimisation of technical systems based on multiphysics simulations. Using two real-world datasets, we illustrate that surrogate models can be trained on relatively small amounts of data to approximate the underlying simulations accurately. Including explainable AI techniques allows for highlighting feature relevancy or dependencies and supporting the possible extension of the used datasets. One of the datasets was created for this paper and is made publicly available for the broader scientific community. Extensive experiments combine four machine learning and deep learning algorithms with an evolutionary optimisation algorithm. The performance of the combined training and optimisation pipeline is evaluated by verifying the generated Pareto-optimal results using the ground truth simulations. The results from our pipeline and a comprehensive evaluation strategy show the potential for efficiently acquiring solution candidates in multiobjective optimisation tasks by reducing the number of simulations and conserving a higher prediction accuracy, i.e., with a MAPE score under 5% for one of the presented use cases.

4/4/2024

cs.LG

Hyperparameter Importance Analysis for Multi-Objective AutoML

Daphne Theodorakopoulos, Frederic Stahl, Marius Lindauer

Hyperparameter optimization plays a pivotal role in enhancing the predictive performance and generalization capabilities of ML models. However, in many applications, we do not only care about predictive performance but also about objectives such as inference time, memory, or energy consumption. In such MOO scenarios, determining the importance of hyperparameters poses a significant challenge due to the complex interplay between the conflicting objectives. In this paper, we propose the first method for assessing the importance of hyperparameters in the context of multi-objective hyperparameter optimization. Our approach leverages surrogate-based hyperparameter importance (HPI) measures, i.e. fANOVA and ablation paths, to provide insights into the impact of hyperparameters on the optimization objectives. Specifically, we compute the a-priori scalarization of the objectives and determine the importance of the hyperparameters for different objective tradeoffs. Through extensive empirical evaluations on diverse benchmark datasets with three different objectives paired with accuracy, namely time, demographic parity, and energy consumption, we demonstrate the effectiveness and robustness of our proposed method. Our findings not only offer valuable guidance for hyperparameter tuning in MOO tasks but also contribute to advancing the understanding of HPI in complex optimization scenarios.

5/16/2024

cs.LG cs.AI