Exact Recovery for System Identification with More Corrupt Data than Clean Data

Read original: arXiv:2305.10506 - Published 4/26/2024 by Baturalp Yalcin, Haixiang Zhang, Javad Lavaei, Murat Arcak

📊

Overview

This paper investigates the problem of system identification for linear discrete-time systems under adversarial attacks.
The authors analyze two lasso-type estimators and examine their asymptotic and non-asymptotic properties under different attack models.
Since the data samples are correlated, the existing results on lasso do not apply, and the authors provide new theoretical guarantees.
The paper shows that the sample complexity for exact recovery of the system dynamics is linear in the state dimension when attacks are periodic, and polynomial in the state dimension and attack probability when attacks are stochastic.
The estimators can still learn the true system dynamics even when more than half the data is compromised by attacks.

Plain English Explanation

This paper looks at the problem of system identification for linear, discrete-time systems when there are adversarial attacks on the data. The researchers analyzed two different types of estimators, called "lasso-type" estimators, to see how well they can recover the true system dynamics in the presence of these attacks.

Normally, the existing results on lasso estimators wouldn't apply here because the data samples are correlated, meaning they're related to each other. But the authors were able to provide new theoretical guarantees on the performance of their estimators.

They considered two different scenarios for the attacks: one where the attacks happen periodically, and one where they occur randomly at each time point with a certain probability. In the periodic case, they showed that the number of samples needed to exactly recover the system dynamics is proportional to the number of state variables. In the random case, the required number of samples scales polynomially with the state dimension and the attack probability.

Interestingly, the researchers found that their estimators can still learn the true system correctly even when more than half the data is compromised by the attacks. This is an important result, as it means these techniques could be useful in situations where there is a lot of noisy or corrupted data.

Technical Explanation

The paper investigates the system identification problem for linear discrete-time systems under adversarial attacks. The authors analyze two lasso-type estimators and examine their asymptotic and non-asymptotic properties in two scenarios: deterministic and stochastic models for the attack times.

Since the data samples are correlated, the existing results on lasso do not apply. The authors prove that when the system is stable and attacks are periodic, the sample complexity for exact recovery of the system dynamics is linear in the state dimension. When attacks occur stochastically at each time with probability p, the required sample complexity scales polynomially in the state dimension and p.

This implies almost sure convergence to the true system dynamics under the asymptotic regime. Importantly, the estimators can still learn the system correctly even when more than half the data is compromised by attacks.

The authors highlight that the attack vectors are allowed to be correlated, while making some assumptions about the attack times. This paper provides the first mathematical guarantees on learning from correlated data for dynamical systems in the case where there is less clean data than corrupt data.

Critical Analysis

The paper provides strong theoretical guarantees for the performance of lasso-type estimators in the presence of adversarial attacks on linear, discrete-time systems. The authors' ability to handle correlated data samples and provide non-asymptotic bounds is a significant contribution.

However, the paper does make some assumptions, such as the periodic or stochastic nature of the attacks. In practice, the attack patterns may be more complex and harder to model. Additionally, the paper does not consider the case where the attack vectors are not correlated, which may be an important scenario to investigate.

Further research could explore the robustness of these estimators to more general attack models, as well as their performance in the presence of model misspecification or other sources of noise. Algorithmic approaches to recovering coefficients of linearizable differential equations may also provide useful insights for this problem.

Overall, this paper represents an important step forward in the field of system identification under adversarial attacks, and the insights and techniques developed here could have significant practical implications.

Conclusion

This paper tackles the problem of system identification for linear, discrete-time systems under adversarial attacks. The authors analyze two lasso-type estimators and provide theoretical guarantees on their performance, even when a majority of the data is compromised.

The key findings are that the sample complexity for exact recovery of the system dynamics is linear in the state dimension for periodic attacks, and polynomial in the state dimension and attack probability for stochastic attacks. These results have important implications for learning from corrupted data in dynamical systems, which is a significant challenge in many real-world applications.

Overall, this work represents an important contribution to the field of system identification and control under adversarial conditions, and the techniques developed here could inspire further research in this direction.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

📊

Exact Recovery for System Identification with More Corrupt Data than Clean Data

Baturalp Yalcin, Haixiang Zhang, Javad Lavaei, Murat Arcak

This paper investigates the system identification problem for linear discrete-time systems under adversaries and analyzes two lasso-type estimators. We examine both asymptotic and non-asymptotic properties of these estimators in two separate scenarios, corresponding to deterministic and stochastic models for the attack times. Since the samples collected from the system are correlated, the existing results on lasso are not applicable. We prove that when the system is stable and attacks are injected periodically, the sample complexity for exact recovery of the system dynamics is linear in terms of the dimension of the states. When adversarial attacks occur at each time instance with probability p, the required sample complexity for exact recovery scales polynomially in the dimension of the states and the probability p. This result implies almost sure convergence to the true system dynamics under the asymptotic regime. As a by-product, our estimators still learn the system correctly even when more than half of the data is compromised. We highlight that the attack vectors are allowed to be correlated with each other in this work, whereas we make some assumptions about the times at which the attacks happen. This paper provides the first mathematical guarantee in the literature on learning from correlated data for dynamical systems in the case when there is less clean data than corrupt data.

4/26/2024

Exact Recovery Guarantees for Parameterized Non-linear System Identification Problem under Adversarial Attacks

Haixiang Zhang, Baturalp Yalcin, Javad Lavaei, Eduardo D. Sontag

In this work, we study the system identification problem for parameterized non-linear systems using basis functions under adversarial attacks. Motivated by the LASSO-type estimators, we analyze the exact recovery property of a non-smooth estimator, which is generated by solving an embedded $ell_1$-loss minimization problem. First, we derive necessary and sufficient conditions for the well-specifiedness of the estimator and the uniqueness of global solutions to the underlying optimization problem. Next, we provide exact recovery guarantees for the estimator under two different scenarios of boundedness and Lipschitz continuity of the basis functions. The non-asymptotic exact recovery is guaranteed with high probability, even when there are more severely corrupted data than clean data. Finally, we numerically illustrate the validity of our theory. This is the first study on the sample complexity analysis of a non-smooth estimator for the non-linear system identification problem.

9/17/2024

Learning Unstable Continuous-Time Stochastic Linear Control Systems

Reza Sadeghi Hafshejani, Mohamad Kazem Shirani Fradonbeh

We study the problem of system identification for stochastic continuous-time dynamics, based on a single finite-length state trajectory. We present a method for estimating the possibly unstable open-loop matrix by employing properly randomized control inputs. Then, we establish theoretical performance guarantees showing that the estimation error decays with trajectory length, a measure of excitability, and the signal-to-noise ratio, while it grows with dimension. Numerical illustrations that showcase the rates of learning the dynamics, will be provided as well. To perform the theoretical analysis, we develop new technical tools that are of independent interest. That includes non-asymptotic stochastic bounds for highly non-stationary martingales and generalized laws of iterated logarithms, among others.

9/18/2024

A least-square method for non-asymptotic identification in linear switching control

Haoyuan Sun, Ali Jadbabaie

The focus of this paper is on linear system identification in the setting where it is known that the underlying partially-observed linear dynamical system lies within a finite collection of known candidate models. We first consider the problem of identification from a given trajectory, which in this setting reduces to identifying the index of the true model with high probability. We characterize the finite-time sample complexity of this problem by leveraging recent advances in the non-asymptotic analysis of linear least-square methods in the literature. In comparison to the earlier results that assume no prior knowledge of the system, our approach takes advantage of the smaller hypothesis class and leads to the design of a learner with a dimension-free sample complexity bound. Next, we consider the switching control of linear systems, where there is a candidate controller for each of the candidate models and data is collected through interaction of the system with a collection of potentially destabilizing controllers. We develop a dimension-dependent criterion that can detect those destabilizing controllers in finite time. By leveraging these results, we propose a data-driven switching strategy that identifies the unknown parameters of the underlying system. We then provide a non-asymptotic analysis of its performance and discuss its implications on the classical method of estimator-based supervisory control.

4/15/2024