Riemannian Projection-free Online Learning

2305.19349

Published 6/4/2024 by Zihao Hu, Guanghui Wang, Jacob Abernethy

❗

Abstract

The projection operation is a critical component in a wide range of optimization algorithms, such as online gradient descent (OGD), for enforcing constraints and achieving optimal regret bounds. However, it suffers from computational complexity limitations in high-dimensional settings or when dealing with ill-conditioned constraint sets. Projection-free algorithms address this issue by replacing the projection oracle with more efficient optimization subroutines. But to date, these methods have been developed primarily in the Euclidean setting, and while there has been growing interest in optimization on Riemannian manifolds, there has been essentially no work in trying to utilize projection-free tools here. An apparent issue is that non-trivial affine functions are generally non-convex in such domains. In this paper, we present methods for obtaining sub-linear regret guarantees in online geodesically convex optimization on curved spaces for two scenarios: when we have access to (a) a separation oracle or (b) a linear optimization oracle. For geodesically convex losses, and when a separation oracle is available, our algorithms achieve $O(T^{1/2}:)$ and $O(T^{3/4};)$ adaptive regret guarantees in the full information setting and the bandit setting, respectively. When a linear optimization oracle is available, we obtain regret rates of $O(T^{3/4};)$ for geodesically convex losses and $O(T^{2/3}; log T )$ for strongly geodesically convex losses.

Create account to get full access

Overview

This paper addresses the challenge of computational complexity in optimization algorithms with constraints, particularly in high-dimensional settings or with ill-conditioned constraint sets.
It introduces projection-free algorithms as a solution, which replace the projection oracle with more efficient optimization subroutines.
The paper focuses on extending projection-free methods to optimization on Riemannian manifolds, where non-trivial affine functions are generally non-convex.

Plain English Explanation

Optimization algorithms are essential tools used to solve a wide range of problems, from machine learning to engineering. One common approach is called online gradient descent (OGD), which involves repeatedly adjusting the solution to minimize an objective function while incorporating new data.

A key component of OGD and other optimization algorithms is the projection operation, which ensures that the solution satisfies certain constraints. This helps the algorithm converge to the optimal solution more quickly and accurately.

However, the projection operation can be computationally expensive, especially in high-dimensional settings or when dealing with constraint sets that are difficult to work with (i.e., "ill-conditioned"). This can limit the practical applicability of these algorithms.

To address this issue, the researchers in this paper explore projection-free algorithms, which replace the projection step with more efficient optimization subroutines. This can significantly improve the computational efficiency of the algorithms.

The researchers further extend these projection-free methods to optimization problems on curved spaces, known as Riemannian manifolds. In these domains, the typical affine functions used in optimization are generally non-convex, which presents additional challenges.

By developing new algorithms and analysis techniques, the researchers show that it is possible to achieve sublinear regret guarantees (i.e., the algorithm's performance is close to the optimal solution) for online optimization on Riemannian manifolds, even without the computationally expensive projection step.

Technical Explanation

The paper introduces two projection-free algorithms for online optimization on Riemannian manifolds:

Separation Oracle Algorithm: When a separation oracle is available (a subroutine that can determine if a point is feasible and, if not, provide a separating hyperplane), the algorithm achieves an O(T^{1/2}) regret guarantee in the full information setting and an O(T^{3/4}) regret guarantee in the bandit setting, where T is the number of iterations.
Linear Optimization Oracle Algorithm: When a linear optimization oracle is available (a subroutine that can find the point on the constraint set that maximizes a linear function), the algorithm achieves an O(T^{3/4}) regret guarantee for geodesically convex losses and an O(T^{2/3} log T) regret guarantee for strongly geodesically convex losses.

The key technical contributions of the paper include:

Developing projection-free algorithms for online optimization on Riemannian manifolds, which extends previous work focused on the Euclidean setting.
Analyzing the regret guarantees of these algorithms, showing that they can achieve sublinear regret even without the computationally expensive projection step.
Handling the challenge of non-convex affine functions on Riemannian manifolds, which is a departure from the typical convex optimization settings.

Critical Analysis

The paper addresses an important practical challenge in optimization algorithms and provides a novel solution by extending projection-free methods to the Riemannian manifold setting. The analysis and algorithms presented are technically sound and contribute to the growing body of work on optimization on curved spaces.

One potential limitation of the research is the reliance on the availability of oracles (separation or linear optimization) to achieve the regret guarantees. In practice, these oracles may not always be readily available or easy to implement, which could limit the real-world applicability of the algorithms.

Additionally, the paper focuses on the theoretical analysis and does not provide extensive empirical validation of the proposed methods. It would be valuable to see how these projection-free algorithms perform compared to alternative approaches in realistic optimization problems on Riemannian manifolds.

Further research could explore ways to relax the assumptions on the oracles or investigate the performance of these algorithms in specific application domains, such as optimization problems in machine learning or robotics on Riemannian manifolds.

Conclusion

This paper presents a novel approach to addressing the computational challenges of optimization algorithms with constraints, particularly in high-dimensional settings or on curved spaces known as Riemannian manifolds. By introducing projection-free algorithms, the researchers have developed methods that can achieve sublinear regret guarantees without the need for the computationally expensive projection step.

The key contributions of this work include extending projection-free techniques to the Riemannian manifold setting, handling the challenge of non-convex affine functions, and providing theoretical regret analyses for the proposed algorithms. While the reliance on oracles may limit the immediate practical applicability, this research represents an important step forward in addressing the computational bottlenecks in optimization algorithms and opens up new avenues for further exploration in this field.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

🌿

Projection-free Online Learning over Strongly Convex Sets

Yuanyu Wan, Lijun Zhang

To efficiently solve online problems with complicated constraints, projection-free algorithms including online frank-wolfe (OFW) and its variants have received significant interest recently. However, in the general case, existing efficient projection-free algorithms only achieved the regret bound of $O(T^{3/4})$, which is worse than the regret of projection-based algorithms, where $T$ is the number of decision rounds. In this paper, we study the special case of online learning over strongly convex sets, for which we first prove that OFW can enjoy a better regret bound of $O(T^{2/3})$ for general convex losses. The key idea is to refine the decaying step-size in the original OFW by a simple line search rule. Furthermore, for strongly convex losses, we propose a strongly convex variant of OFW by redefining the surrogate loss function in OFW. We show that it achieves a regret bound of $O(T^{2/3})$ over general convex sets and a better regret bound of $O(sqrt{T})$ over strongly convex sets.

6/26/2024

cs.LG

🛠️

Universal Online Convex Optimization with $1$ Projection per Round

Wenhao Yang, Yibo Wang, Peng Zhao, Lijun Zhang

To address the uncertainty in function types, recent progress in online convex optimization (OCO) has spurred the development of universal algorithms that simultaneously attain minimax rates for multiple types of convex functions. However, for a $T$-round online problem, state-of-the-art methods typically conduct $O(log T)$ projections onto the domain in each round, a process potentially time-consuming with complicated feasible sets. In this paper, inspired by the black-box reduction of Cutkosky and Orabona (2018), we employ a surrogate loss defined over simpler domains to develop universal OCO algorithms that only require $1$ projection. Embracing the framework of prediction with expert advice, we maintain a set of experts for each type of functions and aggregate their predictions via a meta-algorithm. The crux of our approach lies in a uniquely designed expert-loss for strongly convex functions, stemming from an innovative decomposition of the regret into the meta-regret and the expert-regret. Our analysis sheds new light on the surrogate loss, facilitating a rigorous examination of the discrepancy between the regret of the original loss and that of the surrogate loss, and carefully controlling meta-regret under the strong convexity condition. In this way, with only $1$ projection per round, we establish optimal regret bounds for general convex, exponentially concave, and strongly convex functions simultaneously. Furthermore, we enhance the expert-loss to exploit the smoothness property, and demonstrate that our algorithm can attain small-loss regret for multiple types of convex and smooth functions.

5/31/2024

cs.LG

👨‍🏫

Fully Unconstrained Online Learning

Ashok Cutkosky, Zakaria Mhammedi

We provide an online learning algorithm that obtains regret $G|w_star|sqrt{Tlog(|w_star|Gsqrt{T})} + |w_star|^2 + G^2$ on $G$-Lipschitz convex losses for any comparison point $w_star$ without knowing either $G$ or $|w_star|$. Importantly, this matches the optimal bound $G|w_star|sqrt{T}$ available with such knowledge (up to logarithmic factors), unless either $|w_star|$ or $G$ is so large that even $G|w_star|sqrt{T}$ is roughly linear in $T$. Thus, it matches the optimal bound in all cases in which one can achieve sublinear regret, which arguably most interesting scenarios.

6/3/2024

cs.LG stat.ML

Projection-Free Variance Reduction Methods for Stochastic Constrained Multi-Level Compositional Optimization

Wei Jiang, Sifan Yang, Wenhao Yang, Yibo Wang, Yuanyu Wan, Lijun Zhang

This paper investigates projection-free algorithms for stochastic constrained multi-level optimization. In this context, the objective function is a nested composition of several smooth functions, and the decision set is closed and convex. Existing projection-free algorithms for solving this problem suffer from two limitations: 1) they solely focus on the gradient mapping criterion and fail to match the optimal sample complexities in unconstrained settings; 2) their analysis is exclusively applicable to non-convex functions, without considering convex and strongly convex objectives. To address these issues, we introduce novel projection-free variance reduction algorithms and analyze their complexities under different criteria. For gradient mapping, our complexities improve existing results and match the optimal rates for unconstrained problems. For the widely-used Frank-Wolfe gap criterion, we provide theoretical guarantees that align with those for single-level problems. Additionally, by using a stage-wise adaptation, we further obtain complexities for convex and strongly convex functions. Finally, numerical experiments on different tasks demonstrate the effectiveness of our methods.

6/7/2024

cs.LG