Square Root LASSO: Well-posedness, Lipschitz stability and the tuning trade off

Read original: arXiv:2303.15588 - Published 4/1/2024 by Aaron Berk, Simone Brugiapaglia, Tim Hoheisel

The paper investigates the well-posedness and parameter sensitivity of the Square Root LASSO (SR-LASSO), an optimization model for obtaining sparse solutions to linear inverse problems in finite dimensions. The SR-LASSO offers an advantage over the standard LASSO in terms of the robust tuning of the regularization parameter with respect to measurement noise.
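For reference, the two models are commonly written as below. This is the standard textbook form rather than a quotation from the paper. The symbols are notation assumed here: $A \in \mathbb{R}^{m \times n}$ is the measurement matrix, $b \in \mathbb{R}^m$ the measurement vector, and $\lambda > 0$ the tuning parameter. Scaling conventions for the LASSO, such as the factor $1/2$, vary between sources.

```latex
\begin{align*}
\text{SR-LASSO:} \quad & \min_{x \in \mathbb{R}^n} \; \|Ax - b\|_2 + \lambda \|x\|_1, \\
\text{LASSO:} \quad & \min_{x \in \mathbb{R}^n} \; \tfrac{1}{2}\|Ax - b\|_2^2 + \lambda \|x\|_1.
\end{align*}
```

The only difference is that the data-fidelity term is not squared in the SR-LASSO. As a result, scaling both $x$ and $b$ by a constant $c > 0$ scales the whole SR-LASSO objective by $c$, which gives some intuition for why a good value of $\lambda$ need not track the noise level.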

The paper proposes three point-based regularity conditions for a solution of the SR-LASSO: weak, intermediate, and strong assumptions. The weak assumption ensures the uniqueness of the solution in question. The intermediate assumption guarantees that the solution map is directionally differentiable and locally Lipschitz, with explicit Lipschitz bounds provided. The strong assumption establishes the continuous differentiability of the solution map around the point in question.
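A toy scalar case, worked out here for illustration and not taken from the paper, shows what a solution map and its regularity can look like. Assume $n = 1$, a single column $a \in \mathbb{R}^m$ with $\|a\|_2 = 1$, and $b = c\,a + r$ with $r \perp a$, $c > 0$ and $\rho = \|r\|_2 > 0$. These symbols are all assumptions of this sketch. Writing $x_{\mathrm{SR}}$ and $x_{\mathrm{L}}$ for the SR-LASSO and LASSO solutions as functions of the tuning parameter $\lambda$:

```latex
\begin{align*}
\|ax - b\|_2 + \lambda |x| &= \sqrt{(x - c)^2 + \rho^2} + \lambda |x|, \\
x_{\mathrm{SR}}(\lambda) &= \max\!\Big( c - \frac{\lambda \rho}{\sqrt{1 - \lambda^2}},\; 0 \Big), \qquad 0 < \lambda < 1, \\
\frac{d x_{\mathrm{SR}}}{d\lambda} &= -\frac{\rho}{(1 - \lambda^2)^{3/2}} \quad \text{where } x_{\mathrm{SR}}(\lambda) > 0, \\
x_{\mathrm{L}}(\lambda) &= \max(c - \lambda,\; 0), \qquad \frac{d x_{\mathrm{L}}}{d\lambda} = -1 \quad \text{where } x_{\mathrm{L}}(\lambda) > 0.
\end{align*}
```

In this toy case each solution is unique and depends on $\lambda$ in a piecewise-smooth way: continuously differentiable away from the kink where the solution reaches zero, and only directionally differentiable at the kink itself. The sketch makes no claim about which of the paper's three assumptions holds at which point.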

The analysis reveals new theoretical insights into the comparison between SR-LASSO and LASSO from the perspective of tuning parameter sensitivity. While the SR-LASSO allows for noise-robust optimal parameter choice, this comes at the cost of increased tuning parameter sensitivity. Numerical results are presented to support and illustrate the theoretical findings.
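The noise-robustness half of this trade-off can be illustrated with a small sketch. It is not the paper's experiment, only a check of one elementary consequence of the first-order optimality conditions. For the LASSO objective $\tfrac{1}{2}\|Ax - b\|_2^2 + \lambda\|x\|_1$, the vector $x = 0$ is a solution exactly when $\lambda \ge \|A^\top b\|_\infty$. For the SR-LASSO objective $\|Ax - b\|_2 + \lambda\|x\|_1$ with $b \ne 0$, the condition is $\lambda \ge \|A^\top b\|_\infty / \|b\|_2$. The function name `zero_thresholds`, the matrix `A`, and the noise direction `z` are hypothetical choices made for this example.

```python
import math


def zero_thresholds(A, b):
    """Smallest lambda for which x = 0 solves each problem (b must be nonzero).

    LASSO    (1/2)*||Ax - b||_2^2 + lam*||x||_1 : lam >= ||A^T b||_inf
    SR-LASSO       ||Ax - b||_2   + lam*||x||_1 : lam >= ||A^T b||_inf / ||b||_2
    """
    n_rows, n_cols = len(A), len(A[0])
    # A^T b, computed one column of A at a time
    atb = [sum(A[i][j] * b[i] for i in range(n_rows)) for j in range(n_cols)]
    max_corr = max(abs(v) for v in atb)
    norm_b = math.sqrt(sum(v * v for v in b))
    return max_corr, max_corr / norm_b


# Made-up 2x3 matrix and a fixed unit-norm "noise direction" (illustrative only)
A = [[1.0, 0.5, -0.2],
     [0.0, 1.0, 0.3]]
z = [0.6, -0.8]

results = []
for sigma in (0.1, 1.0, 10.0):
    b = [sigma * v for v in z]  # pure-noise measurements at noise level sigma
    lasso_thr, sr_thr = zero_thresholds(A, b)
    results.append((sigma, lasso_thr, sr_thr))
    print(f"sigma={sigma:5.1f}  LASSO threshold={lasso_thr:.3f}  "
          f"SR-LASSO threshold={sr_thr:.3f}")
```

When the measurements are pure noise, the LASSO threshold grows in proportion to the noise level `sigma`, while the SR-LASSO threshold stays the same. The sketch says nothing about the other half of the trade-off, the sensitivity of the solution to the tuning parameter.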

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Square Root LASSO: Well-posedness, Lipschitz stability and the tuning trade off

Aaron Berk, Simone Brugiapaglia, Tim Hoheisel

This paper studies well-posedness and parameter sensitivity of the Square Root LASSO (SR-LASSO), an optimization model for recovering sparse solutions to linear inverse problems in finite dimension. An advantage of the SR-LASSO (e.g., over the standard LASSO) is that the optimal tuning of the regularization parameter is robust with respect to measurement noise. This paper provides three point-based regularity conditions at a solution of the SR-LASSO: the weak, intermediate, and strong assumptions. It is shown that the weak assumption implies uniqueness of the solution in question. The intermediate assumption yields a directionally differentiable and locally Lipschitz solution map (with explicit Lipschitz bounds), whereas the strong assumption gives continuous differentiability of said map around the point in question. Our analysis leads to new theoretical insights on the comparison between SR-LASSO and LASSO from the viewpoint of tuning parameter sensitivity: noise-robust optimal parameter choice for SR-LASSO comes at the price of elevated tuning parameter sensitivity. Numerical results support and showcase the theoretical findings.

4/1/2024

Algebraic and Statistical Properties of the Ordinary Least Squares Interpolator

Dennis Shen, Dogyoon Song, Peng Ding, Jasjeet S. Sekhon

Deep learning research has uncovered the phenomenon of benign overfitting for overparameterized statistical models, which has drawn significant theoretical interest in recent years. Given its simplicity and practicality, the ordinary least squares (OLS) interpolator has become essential to gain foundational insights into this phenomenon. While properties of OLS are well established in classical, underparameterized settings, its behavior in high-dimensional, overparameterized regimes is less explored (unlike for ridge or lasso regression) though significant progress has been made of late. We contribute to this growing literature by providing fundamental algebraic and statistical results for the minimum $\ell_2$-norm OLS interpolator. In particular, we provide algebraic equivalents of (i) the leave-$k$-out residual formula, (ii) Cochran's formula, and (iii) the Frisch-Waugh-Lovell theorem in the overparameterized regime. These results aid in understanding the OLS interpolator's ability to generalize and have substantive implications for causal inference. Under the Gauss-Markov model, we present statistical results such as an extension of the Gauss-Markov theorem and an analysis of variance estimation under homoskedastic errors for the overparameterized regime. To substantiate our theoretical contributions, we conduct simulations that further explore the stochastic properties of the OLS interpolator.

5/31/2024

The Adaptive $\tau$-Lasso: Robustness and Oracle Properties

Emadaldin Mozafari-Majd, Visa Koivunen

This paper introduces a new regularized version of the robust $\tau$-regression estimator for analyzing high-dimensional datasets subject to gross contamination in the response variables and covariates (explanatory variables). The resulting estimator, termed adaptive $\tau$-Lasso, is robust to outliers and high-leverage points. It also incorporates an adaptive $\ell_1$-norm penalty term, which enables the selection of relevant variables and reduces the bias associated with large true regression coefficients. More specifically, this adaptive $\ell_1$-norm penalty term assigns a weight to each regression coefficient. For a fixed number of predictors $p$, we show that the adaptive $\tau$-Lasso has the oracle property, ensuring both variable-selection consistency and asymptotic normality. Asymptotic normality applies only to the entries of the regression vector corresponding to the true support, assuming knowledge of the true regression vector support. We characterize its robustness by establishing the finite-sample breakdown point and the influence function. We carry out extensive simulations and observe that the class of $\tau$-Lasso estimators exhibits robustness and reliable performance in both contaminated and uncontaminated data settings. We also validate our theoretical findings on robustness properties through simulations. In the face of outliers and high-leverage points, the adaptive $\tau$-Lasso and $\tau$-Lasso estimators achieve the best performance or close-to-best performance in terms of prediction and variable selection accuracy compared to other competing regularized estimators for all scenarios considered in this study. Therefore, the adaptive $\tau$-Lasso and $\tau$-Lasso estimators provide attractive tools for a variety of sparse linear regression problems, particularly in high-dimensional settings and when the data is contaminated by outliers and high-leverage points.

8/12/2024

Parameter choice strategies for regularized least squares approximation of noisy continuous functions on the unit circle

Congpei An, Mou Cai

In this paper, we consider a trigonometric polynomial reconstruction of continuous periodic functions from their noisy values at equidistant nodes of the unit circle by a regularized least squares method. We show that the constructed trigonometric polynomial can be determined explicitly due to the exactness of the trapezoidal rule. Then a concrete error bound is derived based on an estimate of the Lebesgue constant. In particular, we analyze three regularization parameter choice strategies: Morozov's discrepancy principle, the L-curve, and generalized cross-validation. Finally, numerical examples are given to show that parameters chosen well by the above strategies can improve approximation quality.

4/1/2024