Improved generalization with deep neural operators for engineering systems: Path towards digital twin

2301.06701

Published 4/30/2024 by Kazuma Kobayashi, James Daniell, Syed Bahauddin Alam


Abstract

Neural Operator Networks (ONets) represent a novel advancement in machine learning algorithms, offering a robust and generalizable alternative for approximating the solutions of partial differential equations (PDEs). Unlike traditional Neural Networks (NNs), which directly approximate functions, ONets specialize in approximating mathematical operators, enhancing their efficacy in addressing complex PDEs. In this work, we evaluate the capabilities of Deep Operator Networks (DeepONets), an ONet implementation using a branch/trunk architecture. Three test cases are studied: a system of ODEs, a general diffusion system, and the convection/diffusion Burgers equation. It is demonstrated that DeepONets can accurately learn the solution operators, achieving prediction accuracy scores above 0.96 for the ODE and diffusion problems over the observed domain. More importantly, when evaluated on unseen scenarios without retraining (zero-shot capability), the trained models exhibit excellent generalization ability. This underscores ONets' vital niche for surrogate modeling and digital twin development across physical systems. While convection-diffusion poses a greater challenge, the results confirm the promise of ONets and motivate further enhancements to the DeepONet algorithm. This work represents an important step towards unlocking the potential of digital twins through robust and generalizable surrogates.


Overview

Neural Operator Networks (ONets) are a class of machine learning algorithms for approximating solutions to partial differential equations (PDEs).
Unlike traditional neural networks that directly approximate functions, ONets specialize in approximating mathematical operators, making them more effective at addressing complex PDEs.
The paper evaluates the capabilities of Deep Operator Networks (DeepONets), an implementation of ONets using a branch/trunk architecture, across three test cases.

Plain English Explanation

Neural Operator Networks (ONets) are a new type of machine learning algorithm that can be used to solve complex mathematical problems, like partial differential equations. Traditional neural networks try to directly approximate functions, but ONets are designed to approximate the mathematical operators that govern how those functions behave.

This is important because many real-world physical systems are described by PDEs, which are notoriously difficult to solve. By approximating the operators instead of the functions themselves, ONets can more effectively capture the underlying dynamics of these systems. The paper focuses on evaluating a specific implementation of ONets called Deep Operator Networks (DeepONets), which use a unique "branch/trunk" architecture.

The researchers tested DeepONets on three differential-equation problems: a system of ordinary differential equations (ODEs), a general diffusion system, and the convection/diffusion Burgers equation. The results showed that DeepONets could accurately learn the solution operators, achieving prediction accuracy scores above 0.96 for the ODE and diffusion problems.

Importantly, the trained DeepONet models were able to generalize to unseen scenarios, without needing to be retrained. This "zero-shot" capability is crucial for developing robust and versatile digital twins - virtual models that can accurately represent real-world physical systems. While the convection-diffusion problem posed a greater challenge, the overall results demonstrate the promise of ONets for surrogate modeling and digital twin development across a wide range of physical systems.

Technical Explanation

The paper examines Neural Operator Networks (ONets), a machine learning framework designed to approximate the solution operators of partial differential equations (PDEs) rather than the solution functions themselves.
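To make the function-versus-operator distinction concrete, the formulation below uses the notation commonly seen in the DeepONet literature. The symbols are assumptions for illustration, since this summary does not reproduce the paper's own notation: G is the solution operator, u an input function observed at m fixed sensor locations x_1, ..., x_m, y a query location, b_k and t_k the p outputs of the branch and trunk networks, and b_0 a scalar bias.

```latex
G : u \mapsto G(u), \qquad
G(u)(y) \;\approx\; \sum_{k=1}^{p} b_k\bigl(u(x_1), \ldots, u(x_m)\bigr)\, t_k(y) + b_0
```

Read this way, a conventional network would learn one solution function for one fixed u, whereas the operator network takes u itself as an input and can be queried for a new u without retraining.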

The researchers evaluate the capabilities of a specific implementation of ONets called Deep Operator Networks (DeepONets), which use a branch/trunk architecture. Three test cases are studied:

A system of ordinary differential equations (ODEs)
A general diffusion system
The convection/diffusion Burgers equation
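The branch/trunk architecture mentioned above can be sketched as a forward pass. This is a minimal, untrained illustration in plain Python, not the authors' implementation: the layer sizes, the `tanh` activation, the sensor count `m`, the latent width `p`, and all function names are assumptions chosen for readability. In each test case, the branch input would be the problem's input function sampled at the sensors, and the trunk input the location where the solution is queried.

```python
import math
import random


def init_mlp(sizes, rng):
    """Create (weights, biases) pairs for a fully connected network."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def mlp_forward(layers, x):
    """Apply the network to a list of floats: tanh on hidden layers, linear output."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x


def deeponet_forward(branch, trunk, bias, u_sensors, y):
    """Predict G(u)(y) as the dot product of branch and trunk features plus a bias."""
    b = mlp_forward(branch, u_sensors)  # encodes the sampled input function u
    t = mlp_forward(trunk, [y])         # encodes the query location y
    return sum(bk * tk for bk, tk in zip(b, t)) + bias


# Hypothetical sizes: m sensors for u, p shared latent features.
m, p = 20, 8
rng = random.Random(0)
branch = init_mlp([m, 16, p], rng)
trunk = init_mlp([1, 16, p], rng)

sensors = [i / (m - 1) for i in range(m)]
u_sensors = [math.sin(2 * math.pi * x) for x in sensors]  # one sampled input function
prediction = deeponet_forward(branch, trunk, 0.0, u_sensors, y=0.5)
print(prediction)
```

Because the weights are random, the printed value is meaningless until the networks are trained; the sketch only shows how the two sub-networks are combined into a single scalar prediction.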

The results show that DeepONets can accurately learn the solution operators, achieving prediction accuracy scores above 0.96 for the ODE and diffusion problems over the observed domain. Importantly, the trained models also exhibit excellent generalization ability when evaluated on unseen scenarios (zero-shot capability), without the need for retraining.
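The sketch below illustrates what a zero-shot evaluation of this kind might look like. Several things here are assumptions rather than details from the paper: this summary does not name the accuracy metric, so the coefficient of determination (R²) is used as a plausible stand-in; the antiderivative operator and the cosine input functions are toy choices, not the paper's test cases; and `stand_in_surrogate` is a simple trapezoidal integrator playing the role of an already-trained model, not a DeepONet.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def stand_in_surrogate(u_sensors, sensors, y):
    """Placeholder for a trained model: trapezoidal integral of u from 0 to y."""
    total = 0.0
    for x0, x1, u0, u1 in zip(sensors[:-1], sensors[1:],
                              u_sensors[:-1], u_sensors[1:]):
        if y <= x0:
            break
        right = min(x1, y)
        # Linearly interpolate u at the (possibly clipped) right endpoint.
        u_right = u0 + (u1 - u0) * (right - x0) / (x1 - x0)
        total += 0.5 * (u0 + u_right) * (right - x0)
    return total


# Hypothetical held-out inputs u(x) = cos(k x); reference G(u)(y) = sin(k y) / k.
sensors = [i / 20 for i in range(21)]
held_out_k = [1.5, 2.5, 3.5]
queries = [0.1, 0.3, 0.5, 0.7, 0.9]

y_true, y_pred = [], []
for k in held_out_k:
    u_sensors = [math.cos(k * x) for x in sensors]
    for y in queries:
        y_true.append(math.sin(k * y) / k)
        # The surrogate is only queried here; nothing is refit on the new inputs.
        y_pred.append(stand_in_surrogate(u_sensors, sensors, y))

score = r2_score(y_true, y_pred)
print(f"zero-shot R^2 on held-out inputs: {score:.4f}")
```

The point of the sketch is the protocol: the model is frozen, the input functions are ones it was not fitted to, and a single score is computed over all held-out (input function, query point) pairs.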

While the convection-diffusion problem poses a greater challenge, the overall findings confirm the promise of ONets and motivate further enhancements to the DeepONet algorithm. This work represents an important step towards unlocking the potential of digital twins through robust and generalizable surrogates, as demonstrated by the zero-shot performance of the trained models.
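For reference, the one-dimensional viscous Burgers equation is usually written in the textbook form below. The exact formulation, the viscosity value ν, and the initial and boundary conditions used in the paper are not given in this summary, so the symbols here are assumptions.

```latex
\frac{\partial u}{\partial t} + u\,\frac{\partial u}{\partial x}
  = \nu\,\frac{\partial^2 u}{\partial x^2}
```

The nonlinear convection term u ∂u/∂x can steepen gradients when ν is small, which is one plausible reason this case would be harder to learn than a purely diffusive system.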

Critical Analysis

The paper provides a compelling demonstration of the capabilities of Deep Operator Networks (DeepONets) in approximating the solution operators of partial differential equations (PDEs). The researchers' choice of three diverse test cases, ranging from ordinary differential equations (ODEs) to the complex convection-diffusion Burgers equation, helps to showcase the versatility and generalization ability of the DeepONet approach.

One notable aspect of the research is the zero-shot capability exhibited by the trained DeepONet models, where they were able to accurately predict solutions for unseen scenarios without the need for retraining. This is a significant advantage over traditional neural network approaches, as it suggests that DeepONets can develop robust and transferable representations of the underlying mathematical operators.

However, the researchers acknowledge that the convection-diffusion problem posed a greater challenge, and they mention the need for further enhancements to the DeepONet algorithm to address such complex PDEs more effectively. It would be interesting to see if hybrid approaches, combining DeepONets with other techniques, could help improve the performance on these more challenging PDE systems.

Additionally, while the paper demonstrates the potential of DeepONets for surrogate modeling and digital twin development, it would be valuable to see further exploration of the practical applications and real-world implications of this technology. Incorporating more diverse PDE-based problems, as well as investigating the computational efficiency and scalability of DeepONets, could provide additional insights into their suitability for industry-scale deployment.

Conclusion

The Neural Operator Networks (ONets) framework, as exemplified by the Deep Operator Networks (DeepONets) implementation, represents a promising advancement in machine learning for approximating the solutions of partial differential equations (PDEs). By learning the underlying mathematical operators rather than directly approximating functions, DeepONets achieved prediction accuracy scores above 0.96 on the ODE and diffusion test cases and generalized to unseen scenarios without retraining, while the convection-diffusion Burgers equation remained more challenging.

This work lays a foundation for using digital twins and surrogate modeling in physical systems design, optimization, and real-time decision-making. If the DeepONet algorithm is refined further, particularly for convection-diffusion problems, ONets could change how complex, PDE-driven problems in science and engineering are approached.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers


Deep Neural Operator Driven Real Time Inference for Nuclear Systems to Enable Digital Twin Solutions

Kazuma Kobayashi, Syed Bahauddin Alam

This paper focuses on the feasibility of Deep Neural Operator (DeepONet) as a robust surrogate modeling method within the context of digital twin (DT) for nuclear energy systems. Through benchmarking and evaluation, this study showcases the generalizability and computational efficiency of DeepONet in solving a challenging particle transport problem. DeepONet also exhibits remarkable prediction accuracy and speed, outperforming traditional ML methods, making it a suitable algorithm for real-time DT inference. However, the application of DeepONet also reveals challenges related to optimal sensor placement and model evaluation, critical aspects of real-world implementation. Addressing these challenges will further enhance the method's practicality and reliability. Overall, DeepONet presents a promising and transformative tool for nuclear engineering research and applications. Its accurate prediction and computational efficiency capabilities can revolutionize DT systems, advancing nuclear engineering research. This study marks an important step towards harnessing the power of surrogate modeling techniques in critical engineering domains.

4/30/2024

stat.ML cs.LG

Transformers as Neural Operators for Solutions of Differential Equations with Finite Regularity

Benjamin Shih, Ahmad Peyvan, Zhongqiang Zhang, George Em Karniadakis

Neural operator learning models have emerged as very effective surrogates in data-driven methods for partial differential equations (PDEs) across different applications from computational science and engineering. Such operator learning models not only predict particular instances of a physical or biological system in real-time but also forecast classes of solutions corresponding to a distribution of initial and boundary conditions or forcing terms. DeepONet is the first neural operator model and has been tested extensively for a broad class of solutions, including Riemann problems. Transformers have not been used in that capacity, and specifically, they have not been tested for solutions of PDEs with low regularity. In this work, we first establish the theoretical groundwork that transformers possess the universal approximation property as operator learning models. We then apply transformers to forecast solutions of diverse dynamical systems with solutions of finite regularity for a plurality of initial conditions and forcing terms. In particular, we consider three examples: the Izhikevich neuron model, the tempered fractional-order Leaky Integrate-and-Fire (LIF) model, and the one-dimensional Euler equation Riemann problem. For the latter problem, we also compare with variants of DeepONet, and we find that transformers outperform DeepONet in accuracy but they are computationally more expensive.

5/30/2024

cs.LG cs.AI

An Advanced Physics-Informed Neural Operator for Comprehensive Design Optimization of Highly-Nonlinear Systems: An Aerospace Composites Processing Case Study

Milad Ramezankhani, Anirudh Deodhar, Rishi Yash Parekh, Dagnachew Birru

Deep Operator Networks (DeepONets) and their physics-informed variants have shown significant promise in learning mappings between function spaces of partial differential equations, enhancing the generalization of traditional neural networks. However, for highly nonlinear real-world applications like aerospace composites processing, existing models often fail to capture underlying solutions accurately and are typically limited to single input functions, constraining rapid process design development. This paper introduces an advanced physics-informed DeepONet tailored for such complex systems with multiple input functions. Equipped with architectural enhancements like nonlinear decoders and effective training strategies such as curriculum learning and domain decomposition, the proposed model handles high-dimensional design spaces with significantly improved accuracy, outperforming the vanilla physics-informed DeepONet by two orders of magnitude. Its zero-shot prediction capability across a broad design space makes it a powerful tool for accelerating composites process design and optimization, with potential applications in other engineering fields characterized by strong nonlinearity.

6/24/2024

cs.LG

Continuous Learned Primal Dual

Christina Runkel, Ander Biguri, Carola-Bibiane Schonlieb

Neural ordinary differential equations (Neural ODEs) propose the idea that a sequence of layers in a neural network is just a discretisation of an ODE, and thus can instead be directly modelled by a parameterised ODE. This idea has had resounding success in the deep learning literature, with direct or indirect influence in many state-of-the-art ideas, such as diffusion models or time-dependent models. Recently, a continuous version of the U-net architecture has been proposed, showing increased performance over its discrete counterpart in many imaging applications and wrapped with theoretical guarantees around its performance and robustness. In this work, we explore the use of Neural ODEs for learned inverse problems, in particular with the well-known Learned Primal Dual algorithm, and apply it to computed tomography (CT) reconstruction.

5/7/2024

cs.LG eess.IV