On Quantile Randomized Kaczmarz for Linear Systems with Time-Varying Noise and Corruption

Read original: arXiv:2403.19874 - Published 4/1/2024 by Nestor Coria, Jamie Haddock, Jaime Pacheco

↗️

Introduction

The paper considers solving large-scale linear systems that are perturbed by both time-varying noise and corruption. The authors propose using the quantile randomized Kaczmarz (QRK) method, which avoids updates using equations with residuals larger than a specified quantile, to solve such systems.

The main contributions are:

A proof that QRK converges linearly in expectation up to a certain horizon, even with time-varying corruption and noise. The convergence rate depends on the corruption level while the horizon depends on both corruption and noise.
Specializations of the main convergence theorem to cases where the noise is uniformly bounded, follows a known distribution, or is Gaussian.
A result showing that with high probability, after sufficiently many iterations the largest entries of the QRK residual identify the indices of the corrupted equations at that iteration, even with time-varying corruption and noise.

The results demonstrate QRK's robustness to non-static noise and corruption when solving large-scale linear systems. The analysis carefully quantifies the impact of the corruption and noise parameters on the convergence rate and horizon.

Theoretical Results

This section proves the main results stated in the paper. First, it analyzes the convergence of the randomized Kaczmarz method when applied to a linear system perturbed by time-varying noise. This serves as a useful lemma for the subsequent proofs.

The main result, Theorem 1.4, is then proven. Key steps include bounding the quantile of the corrupted residual in terms of the error norm and noise norm (Lemma 2.2), analyzing the expected error when projecting onto a corrupted row (Lemma 2.3), and deriving a recursive expected error bound by considering the probabilities of the different row sampling scenarios.

Corollaries of the main theorem are proven next. Corollary 1.4.1 specializes the main result to the cases of (a) uniformly bounded noise, (b) noise from a known distribution, and (c) Gaussian noise. Corollary 1.4.2 uses Markov's inequality to lower bound the probability of detecting the corrupted equations by examining the largest residual entries.

The proofs leverage probabilistic arguments, norm inequalities, and the assumptions on the linear system, noise, and corruption models. The derived bounds characterize the convergence rate and robustness of the quantile-based randomized Kaczmarz method under time-varying corruption.

Numerical Experiments

The provided section presents numerical experiments using the Quantile Randomized Kaczmarz (QRK) method with various quantiles q. The experiments compare the theoretical convergence guarantee from Theorem 1.4 to the empirical performance of QRK, measured by the approximation error.

Key points:

The experiments use systems with static vs. time-varying corruption and noise. The results show that time-varying aspects have little effect on QRK's behavior or the provided bound.
QRK's empirical behavior and the upper bound from Theorem 1.4 are compared on systems with time-varying noise and corruption. Three plots illustrate how performance varies with respect to the quantile size q, corruption rate β, and noise standard deviation σ.
The fraction of corrupted indices detected by the largest magnitude entries of the QRK residual is compared to the lower bound probability provided by Corollary 1.4.2. The bound is a conservative estimate, likely due to the looseness of the convergence rate upper bound.
The Randomized Kaczmarz (RK) method is tested on systems with time-varying noise. The upper bounds from Theorem 2.1 are quite tight due to the consistent noise across iterations.

The experiments demonstrate the effectiveness of QRK in handling systems with time-varying noise and corruption. They also validate the theoretical bounds, although some bounds may be loose in certain scenarios.

Conclusion

This paper proves that the QRK method converges even when perturbed by time-varying noise and corruption. The authors provide an upper bound on the error of the QRK iterates, which includes a convergence horizon term determined by the magnitude of the time-varying noise and a rate term that depends on the time-varying corruption rate. The result is specialized to several noise models: bounded noise, random noise with given mean and standard deviation, and Gaussian noise. The authors also provide a lower bound on the probability of revealing the corrupted indices by examining the QRK iterate residuals. Numerical experiments illustrate the theoretical results and suggest that many of the estimates are conservative. The results hold for time-varying and potentially adversarial corruption. Future research directions include investigating whether QRK can converge for a larger fraction of corruptions (near 50%) when the corrupted indices are chosen at random, and developing a variant of RK that is robust to both corruption and noise simultaneously.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

↗️

On Quantile Randomized Kaczmarz for Linear Systems with Time-Varying Noise and Corruption

Nestor Coria, Jamie Haddock, Jaime Pacheco

Large-scale systems of linear equations arise in machine learning, medical imaging, sensor networks, and in many areas of data science. When the scale of the systems are extreme, it is common for a fraction of the data or measurements to be corrupted. The Quantile Randomized Kaczmarz (QRK) method is known to converge on large-scale systems of linear equations Ax=b that are inconsistent due to static corruptions in the measurement vector b. We prove that QRK converges even for systems corrupted by time-varying perturbations. Additionally, we prove that QRK converges up to a convergence horizon on systems affected by time-varying noise and demonstrate that the noise affects only the convergence horizon of the method, and not the rate of convergence. Finally, we utilize Markov's inequality to prove a lower bound on the probability that the largest entries of the QRK residual reveal the time-varying corruption in each iteration. We present numerical experiments which illustrate our theoretical results.

4/1/2024

🔍

A Note on Randomized Kaczmarz Algorithm for Solving Doubly-Noisy Linear Systems

El Houcine Bergou, Soumia Boucherouite, Aritra Dutta, Xin Li, Anna Ma

Large-scale linear systems, $Ax=b$, frequently arise in practice and demand effective iterative solvers. Often, these systems are noisy due to operational errors or faulty data-collection processes. In the past decade, the randomized Kaczmarz (RK) algorithm has been studied extensively as an efficient iterative solver for such systems. However, the convergence study of RK in the noisy regime is limited and considers measurement noise in the right-hand side vector, $b$. Unfortunately, in practice, that is not always the case; the coefficient matrix $A$ can also be noisy. In this paper, we analyze the convergence of RK for {textit{doubly-noisy} linear systems, i.e., when the coefficient matrix, $A$, has additive or multiplicative noise, and $b$ is also noisy}. In our analyses, the quantity $tilde R=| tilde A^{dagger} |^2 |tilde A |_F^2$ influences the convergence of RK, where $tilde A$ represents a noisy version of $A$. We claim that our analysis is robust and realistically applicable, as we do not require information about the noiseless coefficient matrix, $A$, and considering different conditions on noise, we can control the convergence of RK. {We perform numerical experiments to substantiate our theoretical findings.}

8/26/2024

fastkqr: A Fast Algorithm for Kernel Quantile Regression

Qian Tang, Yuwen Gu, Boxiang Wang

Quantile regression is a powerful tool for robust and heterogeneous learning that has seen applications in a diverse range of applied areas. However, its broader application is often hindered by the substantial computational demands arising from the non-smooth quantile loss function. In this paper, we introduce a novel algorithm named fastkqr, which significantly advances the computation of quantile regression in reproducing kernel Hilbert spaces. The core of fastkqr is a finite smoothing algorithm that magically produces exact regression quantiles, rather than approximations. To further accelerate the algorithm, we equip fastkqr with a novel spectral technique that carefully reutilizes matrix computations. In addition, we extend fastkqr to accommodate a flexible kernel quantile regression with a data-driven crossing penalty, addressing the interpretability challenges of crossing quantile curves at multiple levels. We have implemented fastkqr in a publicly available R package. Extensive simulations and real applications show that fastkqr matches the accuracy of state-of-the-art algorithms but can operate up to an order of magnitude faster.

8/13/2024

Optimal Kernel Quantile Learning with Random Features

Caixing Wang, Xingdong Feng

The random feature (RF) approach is a well-established and efficient tool for scalable kernel methods, but existing literature has primarily focused on kernel ridge regression with random features (KRR-RF), which has limitations in handling heterogeneous data with heavy-tailed noises. This paper presents a generalization study of kernel quantile regression with random features (KQR-RF), which accounts for the non-smoothness of the check loss in KQR-RF by introducing a refined error decomposition and establishing a novel connection between KQR-RF and KRR-RF. Our study establishes the capacity-dependent learning rates for KQR-RF under mild conditions on the number of RFs, which are minimax optimal up to some logarithmic factors. Importantly, our theoretical results, utilizing a data-dependent sampling strategy, can be extended to cover the agnostic setting where the target quantile function may not precisely align with the assumed kernel space. By slightly modifying our assumptions, the capacity-dependent error analysis can also be applied to cases with Lipschitz continuous losses, enabling broader applications in the machine learning community. To validate our theoretical findings, simulated experiments and a real data application are conducted.

8/27/2024