An L-BFGS-B approach for linear and nonlinear system identification under $\ell_1$- and group-Lasso regularization

Read original: arXiv:2403.03827 - Published 7/18/2024 by Alberto Bemporad

Overview

• This research paper proposes a method for identifying linear and nonlinear dynamical systems using ℓ₁- and group-Lasso regularization, which helps to identify sparse and group-sparse models.

• The authors use the L-BFGS-B optimization algorithm to efficiently solve the regularized system identification problem.

• The paper covers special cases, including linear time-invariant models, bilinear models, and switched affine systems, and provides theoretical guarantees for the proposed method.

Plain English Explanation

The research paper discusses a technique for modeling the behavior of complex systems, such as mechanical, electrical, or biological systems. These systems can often be described using mathematical models, but figuring out the right model can be challenging.

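The authors propose a method that uses ℓ₁-regularization and group-Lasso regularization to help identify the right model. ℓ₁-regularization encourages sparsity: it pushes individual parameters to exactly zero, so the model keeps only a small number of important ones. Group-Lasso regularization acts on predefined groups of related parameters and pushes whole groups to zero together, which is useful when an entire block of the model (for example, every parameter tied to one signal) should be either kept or dropped.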
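To make the two penalties concrete, here is a minimal Python sketch that evaluates them on a toy parameter vector. The function names, the example vector, and the grouping are illustrative assumptions; none of them come from the paper or from the jax-sysid package.

```python
import numpy as np

def l1_penalty(theta):
    """l1 penalty: sum of the absolute values of all parameters."""
    return float(np.sum(np.abs(theta)))

def group_lasso_penalty(theta, groups):
    """Group-Lasso penalty: sum of the Euclidean norms of each group.

    `groups` is a list of index lists (a hypothetical grouping chosen by the user).
    """
    flat = np.ravel(theta)
    return float(sum(np.linalg.norm(flat[idx]) for idx in groups))

# Toy parameter vector with one group that is entirely zero.
theta = np.array([3.0, -4.0, 0.0, 0.0, 1.0])
groups = [[0, 1], [2, 3], [4]]

l1 = l1_penalty(theta)                   # |3| + |-4| + 0 + 0 + |1|
gl = group_lasso_penalty(theta, groups)  # ||(3,-4)|| + ||(0,0)|| + ||(1)||
print(l1, gl)
```

The second group adds nothing to the group-Lasso penalty because all of its entries are zero, which is the pattern this penalty rewards.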

The authors use a numerical optimization algorithm called L-BFGS-B to efficiently solve the regularized system identification problem. This allows them to find the best model that fits the available data.

The paper covers some special cases, like linear time-invariant models, bilinear models, and switched affine systems. It also provides theoretical guarantees about the performance of the proposed method.

Technical Explanation

The paper formulates the system identification problem as a regularized optimization problem, where the goal is to find a model that fits the observed data well while also being simple and interpretable. The authors consider ℓ₁-regularization, which encourages sparsity in the model parameters, and group-Lasso regularization, which can capture group structure in the parameters.
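In symbols, a problem of this kind can be written as follows. The notation is an assumption chosen for illustration and may differ from the paper's: f is the data-fit loss, theta collects the model parameters, tau and tau_g are nonnegative regularization weights, and G_1, ..., G_m are the index sets of the parameter groups.

```latex
\min_{\theta} \; f(\theta) \;+\; \tau \,\|\theta\|_1 \;+\; \tau_g \sum_{i=1}^{m} \big\|\theta_{\mathcal{G}_i}\big\|_2
```

Setting tau_g = 0 leaves a pure ℓ₁-regularized problem, and setting tau = 0 leaves a pure group-Lasso problem.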

To efficiently solve the regularized problem, the authors use the L-BFGS-B optimization algorithm, which is a variant of the popular L-BFGS algorithm that can handle box constraints on the parameters. This allows the method to be applied to a wide range of system identification problems, including linear time-invariant models, bilinear models, and switched affine systems.
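One standard way to connect box constraints with an ℓ₁ penalty is to split each parameter into nonnegative positive and negative parts, which replaces the non-smooth absolute value with a linear term under simple bounds. The sketch below applies that idea to a plain ℓ₁-regularized least-squares problem using SciPy's L-BFGS-B solver. It is a simplified illustration under these assumptions, not the paper's implementation: the function name `lasso_lbfgsb`, the synthetic data, and the value of `tau` are all hypothetical, and the paper's state-space models are replaced here by a static linear regression.

```python
import numpy as np
from scipy.optimize import minimize

def lasso_lbfgsb(X, y, tau):
    """Solve min_theta 0.5/N * ||X theta - y||^2 + tau * ||theta||_1
    by writing theta = p - q with p >= 0, q >= 0 (bound constraints)."""
    N, n = X.shape

    def fun(z):
        p, q = z[:n], z[n:]
        r = X @ (p - q) - y                     # residual
        f = 0.5 * (r @ r) / N + tau * np.sum(z)  # sum(z) = sum(p + q) since z >= 0
        g = X.T @ r / N                          # gradient of the fit term w.r.t. theta
        return f, np.concatenate([g + tau, -g + tau])

    res = minimize(fun, np.zeros(2 * n), jac=True, method="L-BFGS-B",
                   bounds=[(0.0, None)] * (2 * n))
    return res.x[:n] - res.x[n:]

# Synthetic data (assumed for illustration): only two nonzero parameters.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
theta_true = np.zeros(10)
theta_true[[1, 6]] = [2.0, -1.5]
y = X @ theta_true + 0.01 * rng.standard_normal(200)

theta_hat = lasso_lbfgsb(X, y, tau=0.05)
print(np.round(theta_hat, 3))
```

With tau > 0, at most one of p_i and q_i is nonzero at the optimum, so p_i + q_i equals the absolute value of the i-th parameter and the split problem matches the original one. The estimate is shrunk slightly toward zero by the penalty, which is the usual Lasso bias.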

The paper provides theoretical guarantees on the performance of the proposed method, including bounds on the estimation error and conditions for model selection consistency. The authors also demonstrate the effectiveness of their approach through numerical experiments on both simulated and real-world data.

Critical Analysis

The paper presents a well-designed and thorough approach to the system identification problem, with a strong theoretical foundation and practical implementations. However, some potential limitations and areas for further research are:

• The special cases discussed are relatively simple model classes, such as linear time-invariant and bilinear systems. The abstract notes that the method also covers a broad class of parametric nonlinear state-space models, including recurrent neural networks, so it would be interesting to see how it performs as such models grow larger and more complex.
• The paper assumes that the system parameters are time-invariant. It may be worthwhile to explore extensions of the method to handle time-varying systems, which are common in many real-world applications.
• The theoretical analysis in the paper relies on certain assumptions, such as the availability of sufficiently rich data and the satisfaction of various technical conditions. It would be valuable to understand how robust the method is when these assumptions are violated in practice.
• The computational efficiency of the L-BFGS-B algorithm may be a concern for very large-scale problems. It could be interesting to investigate faster and more scalable solvers for the regularized system identification problem.

Overall, the paper presents a promising approach to the system identification problem and opens up several avenues for further research and improvement.

Conclusion

This research paper introduces a method for identifying linear and nonlinear dynamical systems using ℓ₁- and group-Lasso regularization, which helps to identify sparse and group-sparse models. The authors use the L-BFGS-B optimization algorithm to efficiently solve the regularized system identification problem, and they provide theoretical guarantees for the proposed method.

The paper covers special cases, including linear time-invariant models, bilinear models, and switched affine systems, and demonstrates the effectiveness of the approach through numerical experiments. While the paper presents a solid foundation, there are opportunities for further research, such as exploring more complex nonlinear systems, handling time-varying systems, and investigating more efficient optimization algorithms.

Overall, this work contributes to the ongoing efforts to develop robust and interpretable methods for system identification, with potential applications in various fields, including engineering, physics, and biology.

Related Papers

An L-BFGS-B approach for linear and nonlinear system identification under $\ell_1$- and group-Lasso regularization

Alberto Bemporad

In this paper, we propose a very efficient numerical method based on the L-BFGS-B algorithm for identifying linear and nonlinear discrete-time state-space models, possibly under $\ell_1$- and group-Lasso regularization for reducing model complexity. For the identification of linear models, we show that, compared to classical linear subspace methods, the approach often provides better results, is much more general in terms of the loss and regularization terms used (such as penalties for enforcing system stability), and is also more stable from a numerical point of view. The proposed method not only enriches the existing set of linear system identification tools but can also be applied to identifying a very broad class of parametric nonlinear state-space models, including recurrent neural networks. We illustrate the approach on synthetic and experimental datasets and apply it to solve a challenging industrial robot benchmark for nonlinear multi-input/multi-output system identification. A Python implementation of the proposed identification method is available in the package jax-sysid, available at https://github.com/bemporad/jax-sysid.

7/18/2024

Learning linear dynamical systems under convex constraints

Hemant Tyagi, Denis Efimov

We consider the problem of finite-time identification of linear dynamical systems from $T$ samples of a single trajectory. Recent results have predominantly focused on the setup where no structural assumption is made on the system matrix $A^* \in \mathbb{R}^{n \times n}$, and have consequently analyzed the ordinary least squares (OLS) estimator in detail. We assume prior structural information on $A^*$ is available, which can be captured in the form of a convex set $\mathcal{K}$ containing $A^*$. For the solution of the ensuing constrained least squares estimator, we derive non-asymptotic error bounds in the Frobenius norm that depend on the local size of $\mathcal{K}$ at $A^*$. To illustrate the usefulness of these results, we instantiate them for four examples, namely when (i) $A^*$ is sparse and $\mathcal{K}$ is a suitably scaled $\ell_1$ ball; (ii) $\mathcal{K}$ is a subspace; (iii) $\mathcal{K}$ consists of matrices each of which is formed by sampling a bivariate convex function on a uniform $n \times n$ grid (convex regression); (iv) $\mathcal{K}$ consists of matrices each row of which is formed by uniform sampling (with step size $1/T$) of a univariate Lipschitz function. In all these situations, we show that $A^*$ can be reliably estimated for values of $T$ much smaller than what is needed for the unconstrained setting.

5/3/2024

A least-square method for non-asymptotic identification in linear switching control

Haoyuan Sun, Ali Jadbabaie

The focus of this paper is on linear system identification in the setting where it is known that the underlying partially-observed linear dynamical system lies within a finite collection of known candidate models. We first consider the problem of identification from a given trajectory, which in this setting reduces to identifying the index of the true model with high probability. We characterize the finite-time sample complexity of this problem by leveraging recent advances in the non-asymptotic analysis of linear least-square methods in the literature. In comparison to the earlier results that assume no prior knowledge of the system, our approach takes advantage of the smaller hypothesis class and leads to the design of a learner with a dimension-free sample complexity bound. Next, we consider the switching control of linear systems, where there is a candidate controller for each of the candidate models and data is collected through interaction of the system with a collection of potentially destabilizing controllers. We develop a dimension-dependent criterion that can detect those destabilizing controllers in finite time. By leveraging these results, we propose a data-driven switching strategy that identifies the unknown parameters of the underlying system. We then provide a non-asymptotic analysis of its performance and discuss its implications on the classical method of estimator-based supervisory control.

4/15/2024

A neural network-based approach to hybrid systems identification for control

Filippo Fabiani, Bartolomeo Stellato, Daniele Masti, Paul J. Goulart

We consider the problem of designing a machine learning-based model of an unknown dynamical system from a finite number of (state-input)-successor state data points, such that the model obtained is also suitable for optimal control design. We propose a specific neural network (NN) architecture that yields a hybrid system with piecewise-affine dynamics that is differentiable with respect to the network's parameters, thereby enabling the use of derivative-based training procedures. We show that a careful choice of our NN's weights produces a hybrid system model with structural properties that are highly favourable when used as part of a finite horizon optimal control problem (OCP). Specifically, we show that optimal solutions with strong local optimality guarantees can be computed via nonlinear programming, in contrast to classical OCPs for general hybrid systems which typically require mixed-integer optimization. In addition to being well-suited for optimal control design, numerical simulations illustrate that our NN-based technique enjoys very similar performance to state-of-the-art system identification methodologies for hybrid systems and it is competitive on nonlinear benchmarks.

4/3/2024