Convex Analysis at Infinity: An Introduction to Astral Space

Read original: arXiv:2205.03260 - Published 7/17/2024 by Miroslav Dud'ik, Robert E. Schapire, Matus Telgarsky

📶

Overview

Some convex functions on $\mathbb{R}^n$ do not have finite minimizers and can only be minimized by a sequence approaching infinity.
This paper aims to develop a theory for understanding such minimizers at infinity.
The authors study "astral space," a compact extension of $\mathbb{R}^n$ that includes points at infinity.
Astral space is constructed to allow continuous extension of linear functions, but it is not a vector space or metric space.
The authors develop concepts of convexity, conjugacy, and subdifferentials in astral space and analyze properties of convex functions, including minimizers, continuity, and convergence of descent algorithms.

Plain English Explanation

Not all mathematical functions that are "convex" (bowl-shaped) have a finite minimum value that can be easily found. Some of these functions can only be minimized by a sequence of points that gets closer and closer to infinity. In this paper, the researchers aim to develop a way to understand these minimizers that are at infinity.

They introduce a new mathematical space called "astral space" that includes $\mathbb{R}^n$ (the normal real number coordinate system) but also adds in points representing the "points at infinity." This astral space is designed to be as small as possible while still allowing important mathematical concepts like convexity, conjugacy (a way to transform functions), and subdifferentials (a generalization of derivatives) to be extended and studied.

Even though astral space contains all of $\mathbb{R}^n$, it is not a vector space or a metric space (a space with a way to measure distance). However, the researchers show that astral space has enough structure to allow them to analyze the detailed properties of convex functions, including where their minimizers are located, when they are continuous, and how optimization algorithms converge when used on these functions.

Technical Explanation

The paper studies the problem of minimizing convex functions on $\mathbb{R}^n$ that do not have finite minimizers. To address this, the authors introduce the concept of "astral space," a compact extension of $\mathbb{R}^n$ that includes points representing minimizers at infinity.

Astral space is constructed to be the smallest possible space that still allows all linear functions on $\mathbb{R}^n$ to be continuously extended. This is achieved by adding "astral points" to $\mathbb{R}^n$, representing the limits of sequences that diverge to infinity. Although astral space contains $\mathbb{R}^n$, it is not a vector space or a metric space.

Despite its lack of vector or metric structure, the authors show that astral space retains enough structure to allow the meaningful extension of concepts like convexity, conjugacy, and subdifferentials. They develop these notions in detail and use them to analyze the properties of convex functions on astral space, including the structure of their minimizers, precise characterizations of continuity, and the convergence of descent algorithms.

Critical Analysis

The paper provides a novel and rigorous mathematical framework for studying convex optimization problems that lack finite minimizers. By introducing astral space, the authors have created a compact setting in which these "minimizers at infinity" can be analyzed and understood.

One potential limitation of the work is that astral space, while well-structured, does not have the familiar vector or metric properties of $\mathbb{R}^n$. This could make it challenging to apply standard optimization techniques and intuitions in this new setting. The authors acknowledge this and focus on developing new tools and concepts tailored to the astral space setting.

Additionally, the paper is highly technical and may be inaccessible to readers without a strong background in convex analysis and functional analysis. The authors could potentially expand on the intuitions and practical implications of their results to make the work more accessible to a broader audience.

Overall, this paper represents an important contribution to the understanding of convex optimization problems with non-finite minimizers. The authors' development of the astral space framework opens up new avenues for research and could have implications for physics-informed neural networks, finite-dimensional approximations, mirror-preconditioned gradient descent, and approximation theory in deep learning, particularly in cases where non-geodesically convex optimization is a concern.

Conclusion

This paper introduces a novel mathematical framework, called astral space, to study convex optimization problems that lack finite minimizers. By extending the real number system $\mathbb{R}^n$ to include points at infinity, the authors develop a compact setting in which key concepts like convexity, conjugacy, and subdifferentials can be meaningfully extended and analyzed.

The detailed study of convex functions on astral space provides new insights into the structure of minimizers at infinity and the behavior of optimization algorithms in these settings. While the technical nature of the work may limit its immediate accessibility, the authors' contributions open up promising directions for further research and potential applications in fields like machine learning and physics-informed modeling.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

📶

Convex Analysis at Infinity: An Introduction to Astral Space

Miroslav Dud'ik, Robert E. Schapire, Matus Telgarsky

Not all convex functions on $mathbb{R}^n$ have finite minimizers; some can only be minimized by a sequence as it heads to infinity. In this work, we aim to develop a theory for understanding such minimizers at infinity. We study astral space, a compact extension of $mathbb{R}^n$ to which such points at infinity have been added. Astral space is constructed to be as small as possible while still ensuring that all linear functions can be continuously extended to the new space. Although astral space includes all of $mathbb{R}^n$, it is not a vector space, nor even a metric space. However, it is sufficiently well-structured to allow useful and meaningful extensions of concepts of convexity, conjugacy, and subdifferentials. We develop these concepts and analyze various properties of convex functions on astral space, including the detailed structure of their minimizers, exact characterizations of continuity, and convergence of descent algorithms.

7/17/2024

🌀

Global optimality under amenable symmetry constraints

Peter Orbanz

Consider a convex function that is invariant under an group of transformations. If it has a minimizer, does it also have an invariant minimizer? Variants of this problem appear in nonparametric statistics and in a number of adjacent fields. The answer depends on the choice of function, and on what one may loosely call the geometry of the problem -- the interplay between convexity, the group, and the underlying vector space, which is typically infinite-dimensional. We observe that this geometry is completely encoded in the smallest closed convex invariant subsets of the space, and proceed to study these sets, for groups that are amenable but not necessarily compact. We then apply this toolkit to the invariant optimality problem. It yields new results on invariant kernel mean embeddings and risk-optimal invariant couplings, and clarifies relations between seemingly distinct ideas, such as the summation trick used in machine learning to construct equivariant neural networks and the classic Hunt-Stein theorem of statistics.

7/22/2024

🏋️

Astral: training physics-informed neural networks with error majorants

Vladimir Fanaskov, Tianchi Yu, Alexander Rudikov, Ivan Oseledets

The primal approach to physics-informed learning is a residual minimization. We argue that residual is, at best, an indirect measure of the error of approximate solution and propose to train with error majorant instead. Since error majorant provides a direct upper bound on error, one can reliably estimate how close PiNN is to the exact solution and stop the optimization process when the desired accuracy is reached. We call loss function associated with error majorant $textbf{Astral}$: neur$textbf{A}$l a po$textbf{ST}$erio$textbf{RI}$ function$textbf{A}$l Loss. To compare Astral and residual loss functions, we illustrate how error majorants can be derived for various PDEs and conduct experiments with diffusion equations (including anisotropic and in the L-shaped domain), convection-diffusion equation, temporal discretization of Maxwell's equation, and magnetostatics problem. The results indicate that Astral loss is competitive to the residual loss, typically leading to faster convergence and lower error (e.g., for Maxwell's equations, we observe an order of magnitude better relative error and training time). We also report that the error estimate obtained with Astral loss is usually tight enough to be informative, e.g., for a highly anisotropic equation, on average, Astral overestimates error by a factor of $1.5$, and for convection-diffusion by a factor of $1.7$.

6/6/2024

🔗

Invariant kernels on Riemannian symmetric spaces: a harmonic-analytic approach

Nathael Da Costa, Cyrus Mostajeran, Juan-Pablo Ortega, Salem Said

This work aims to prove that the classical Gaussian kernel, when defined on a non-Euclidean symmetric space, is never positive-definite for any choice of parameter. To achieve this goal, the paper develops new geometric and analytical arguments. These provide a rigorous characterization of the positive-definiteness of the Gaussian kernel, which is complete but for a limited number of scenarios in low dimensions that are treated by numerical computations. Chief among these results are the L$^{!scriptscriptstyle p}$-$hspace{0.02cm}$Godement theorems (where $p = 1,2$), which provide verifiable necessary and sufficient conditions for a kernel defined on a symmetric space of non-compact type to be positive-definite. A celebrated theorem, sometimes called the Bochner-Godement theorem, already gives such conditions and is far more general in its scope, but is especially hard to apply. Beyond the connection with the Gaussian kernel, the new results in this work lay out a blueprint for the study of invariant kernels on symmetric spaces, bringing forth specific harmonic analysis tools that suggest many future applications.

9/9/2024