Learning from Integral Losses in Physics Informed Neural Networks

arXiv:2305.17387

Published 6/12/2024 by Ehsan Saleh, Saba Ghaffari, Timothy Bretl, Luke Olson, Matthew West

Abstract

This work proposes a solution for the problem of training physics-informed networks under partial integro-differential equations. These equations require an infinite or a large number of neural evaluations to construct a single residual for training. As a result, accurate evaluation may be impractical, and we show that naive approximations at replacing these integrals with unbiased estimates lead to biased loss functions and solutions. To overcome this bias, we investigate three types of potential solutions: the deterministic sampling approaches, the double-sampling trick, and the delayed target method. We consider three classes of PDEs for benchmarking; one defining Poisson problems with singular charges and weak solutions of up to 10 dimensions, another involving weak solutions on electro-magnetic fields and a Maxwell equation, and a third one defining a Smoluchowski coagulation problem. Our numerical results confirm the existence of the aforementioned bias in practice and also show that our proposed delayed target approach can lead to accurate solutions with comparable quality to ones estimated with a large sample size integral. Our implementation is open-source and available at https://github.com/ehsansaleh/btspinn.

Overview

This paper proposes a solution for the challenge of training physics-informed neural networks under partial integro-differential equations.
These equations require a large number of neural network evaluations to construct a single training residual, making accurate evaluation impractical.
The paper shows that the naive fix, replacing each integral with an unbiased sample estimate, produces a biased loss, and it investigates three remedies: deterministic sampling, the double-sampling trick, and the delayed target method.
The methods are benchmarked on three classes of partial differential equations (PDEs): Poisson problems with singular charges, an electromagnetic problem based on a Maxwell equation, and a Smoluchowski coagulation problem.

Plain English Explanation

Partial integro-differential equations are mathematical models that contain both derivatives and integrals of the unknown function. They are often used in physics to describe complex phenomena. Training a neural network to solve them is difficult because computing even a single training error requires evaluating the network a very large number of times to approximate the integral.

This paper studies three approaches to this problem. The first, "deterministic sampling," evaluates the integrals at a fixed set of points instead of random ones. The second, the "double-sampling trick," draws two independent sets of random samples and multiplies the two resulting error estimates so that the sampling noise averages out. The third, the "delayed target method," computes the integral estimate with a slowly updated copy of the network and treats it as a fixed target, which keeps the noise in the estimate from biasing the training signal.

The researchers tested these methods on three types of equations from physics: Poisson problems from electrostatics, an electromagnetic problem involving a Maxwell equation, and the Smoluchowski equation for particle coagulation. Their results confirm that the naive approach is biased in practice and show that the delayed target method can produce solutions comparable in accuracy to those obtained with a large number of integral samples.

Technical Explanation

The paper addresses the problem of training physics-informed neural networks on partial integro-differential equations. These equations require an infinite or large number of neural network evaluations to construct a single training residual, making accurate evaluation impractical.
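The origin of the bias can be made explicit with a short derivation. The notation here is an assumption for illustration and is not taken from the paper: $f_\theta(x)$ denotes the part of the residual that needs no integration, $I_\theta(x) = \mathbb{E}_{x'}[g_\theta(x, x')]$ the integral term, and $\hat{I}_\theta(x)$ its average over $N$ independent samples of $x'$.

```latex
\begin{aligned}
\mathbb{E}\left[\big(f_\theta(x) - \hat{I}_\theta(x)\big)^2\right]
  &= \big(f_\theta(x) - I_\theta(x)\big)^2 + \operatorname{Var}\big[\hat{I}_\theta(x)\big] \\
  &= \big(f_\theta(x) - I_\theta(x)\big)^2 + \frac{1}{N}\operatorname{Var}_{x'}\big[g_\theta(x, x')\big]
\end{aligned}
```

The first term is the loss one actually wants to minimize. The second term also depends on $\theta$, so minimizing the sampled loss additionally rewards networks whose integrand has low variance, and this excess term only vanishes as $N$ grows.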

The authors investigate three potential solutions to the bias introduced by naive approximations of the integrals:

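Deterministic Sampling: This approach evaluates the integral on a fixed, deterministic set of points (for example, a quadrature rule) rather than on fresh random samples. This removes the sampling variance behind the bias, at the price of a fixed approximation error.
Double-Sampling Trick: This method draws two independent sets of random samples and multiplies the two resulting residual estimates. The product is an unbiased estimate of the squared residual, typically at the cost of higher variance.
Delayed Target Method: This approach estimates the integral term with a delayed copy of the network and treats the result as a fixed regression target, in the spirit of target networks in temporal-difference learning, so that no gradient flows through the noisy estimate.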
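The first two ideas can be checked numerically on a toy integral. The sketch below is not the paper's code: it assumes a fixed scalar `f` in place of a network output and takes the integral to be the mean of `x'` drawn uniformly from [0, 1], so the exact answer is 0.5. All function names are hypothetical.

```python
import numpy as np

INTEGRAL = 0.5  # exact value of the toy integral E[x'] for x' ~ Uniform(0, 1)


def naive_loss(f, n_samples, n_trials, rng):
    """Average of (f - sample mean)^2 over many trials.

    Each trial uses ONE sample set for the integral, so the expected
    value is the true loss plus Var[x'] / n_samples.
    """
    x = rng.uniform(size=(n_trials, n_samples))
    return float(np.mean((f - x.mean(axis=1)) ** 2))


def double_sampling_loss(f, n_samples, n_trials, rng):
    """Average of the product of two residuals from independent sample sets.

    Independence makes the expected product equal to the true squared
    residual, so the variance term drops out.
    """
    x_a = rng.uniform(size=(n_trials, n_samples))
    x_b = rng.uniform(size=(n_trials, n_samples))
    return float(np.mean((f - x_a.mean(axis=1)) * (f - x_b.mean(axis=1))))


def midpoint_loss(f, n_points):
    """Squared residual with the integral evaluated by a midpoint rule."""
    nodes = (np.arange(n_points) + 0.5) / n_points  # fixed, non-random points
    return float((f - nodes.mean()) ** 2)


rng = np.random.default_rng(0)
f = 0.7
print("exact loss          :", (f - INTEGRAL) ** 2)
print("naive, N=1          :", naive_loss(f, 1, 200_000, rng))
print("double sampling, N=1:", double_sampling_loss(f, 1, 200_000, rng))
print("midpoint rule, 8 pts:", midpoint_loss(f, 8))
```

With one sample per estimate, the naive loss is expected to exceed the exact loss by about Var[x'] = 1/12, while the double-sampling and midpoint estimates should stay close to the exact value. The midpoint rule is exact here only because the toy integrand is linear. In general a deterministic rule trades sampling variance for a fixed quadrature error.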

The paper benchmarks these methods on three classes of PDEs: Poisson problems with singular charges and weak solutions in up to 10 dimensions, problems involving weak solutions of electro-magnetic fields and a Maxwell equation, and a Smoluchowski coagulation problem.

The results confirm the existence of the bias issue in practice and show that the delayed target method can lead to accurate solutions with comparable quality to those estimated with a large sample size integral.
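The same behaviour can be reproduced on a problem small enough to solve by hand. The sketch below is an illustration under stated assumptions, not the authors' implementation (which is in the repository linked in the abstract). It assumes a hypothetical toy equation u(x) = x + C * E[u(x')] with x' uniform on [0, 1] and C = 0.5, whose exact solution is u(x) = 0.5 + x, and a linear model u(x) = theta[0] + theta[1] * x. The Polyak constant `tau`, the learning rate, and the batch size are likewise arbitrary choices.

```python
import numpy as np

C = 0.5  # hypothetical coupling constant of the toy equation


def features(x):
    """Return rows [1, x] so that u(x; theta) = features(x) @ theta."""
    return np.stack([np.ones_like(x), x], axis=1)


def train(method, steps=8000, batch=1024, lr=0.05, tau=0.95, seed=0):
    """Fit theta with SGD using ONE integral sample per collocation point."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    theta_target = theta.copy()  # delayed copy, only used by "delayed"
    for _ in range(steps):
        x = rng.uniform(size=batch)        # collocation points
        x_prime = rng.uniform(size=batch)  # single-sample integral estimate
        phi, phi_prime = features(x), features(x_prime)
        if method == "naive":
            # The gradient flows through the noisy integral estimate as well.
            residual = phi @ theta - x - C * (phi_prime @ theta)
            grad = 2.0 * np.mean(residual[:, None] * (phi - C * phi_prime), axis=0)
        elif method == "delayed":
            # The integral estimate uses the delayed parameters and is
            # treated as a constant target: no gradient through it.
            target = x + C * (phi_prime @ theta_target)
            residual = phi @ theta - target
            grad = 2.0 * np.mean(residual[:, None] * phi, axis=0)
        else:
            raise ValueError(f"unknown method: {method}")
        theta = theta - lr * grad
        # Polyak averaging lets the target parameters trail the trained ones.
        theta_target = tau * theta_target + (1.0 - tau) * theta
    return theta


theta_naive = train("naive")
theta_delayed = train("delayed")
print("exact solution :", np.array([0.5, 1.0]))
print("naive loss     :", theta_naive)
print("delayed target :", theta_delayed)
```

For this toy problem the minimizer of the naive single-sample loss can be computed in closed form as theta = (0.6, 0.8): the variance penalty flattens the slope. The delayed-target update has its fixed point at the exact solution (0.5, 1.0), so the two runs should settle near these respective values up to SGD noise.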

Critical Analysis

The paper presents a thorough investigation of the bias that arises when training physics-informed neural networks on partial integro-differential equations. It compares three candidate remedies and advocates the delayed target method.

One potential limitation of the research is that the benchmarking is limited to three specific classes of PDEs. It would be valuable to see how the proposed methods perform on a wider range of partial integro-differential equations, including those with more complex or nonlinear structures.

Additionally, the paper does not delve into the computational cost and runtime implications of the different methods. In practical applications, these factors may be just as important as the solution quality, so further analysis in this area would be beneficial.

It would also be interesting to see how the proposed techniques compare to other approaches for solving partial differential equations, such as Bayesian methods or constrained neural networks.

Overall, this paper makes a valuable contribution to the field of physics-informed machine learning by addressing an important practical challenge. The proposed solutions show promise and warrant further investigation and testing.

Conclusion

This paper presents a novel approach to training physics-informed neural networks on partial integro-differential equations, which are challenging due to the large number of neural network evaluations required to construct a single training residual.

The researchers investigated three potential solutions to the bias introduced by naive approximations of the integrals: deterministic sampling, the double-sampling trick, and the delayed target method. Their benchmarking on three classes of PDEs showed that the delayed target method can produce solutions comparable in quality to those obtained with large-sample integral estimates, a clear improvement over the naive sampled loss.

The findings of this paper have important implications for the field of physics-informed machine learning, as they demonstrate a practical way to overcome a key challenge in applying neural networks to partial integro-differential equations. Further research is needed to explore the wider applicability of these methods and their computational efficiency, but this work represents an important step forward in this rapidly evolving area of study.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Solving partial differential equations with sampled neural networks

Chinmay Datar, Taniya Kapoor, Abhishek Chandra, Qing Sun, Iryna Burak, Erik Lien Bolager, Anna Veselovska, Massimo Fornasier, Felix Dietrich

Approximation of solutions to partial differential equations (PDE) is an important problem in computational science and engineering. Using neural networks as an ansatz for the solution has proven a challenge in terms of training time and approximation accuracy. In this contribution, we discuss how sampling the hidden weights and biases of the ansatz network from data-agnostic and data-dependent probability distributions allows us to progress on both challenges. In most examples, the random sampling schemes outperform iterative, gradient-based optimization of physics-informed neural networks regarding training time and accuracy by several orders of magnitude. For time-dependent PDE, we construct neural basis functions only in the spatial domain and then solve the associated ordinary differential equation with classical methods from scientific computing over a long time horizon. This alleviates one of the greatest challenges for neural PDE solvers because it does not require us to parameterize the solution in time. For second-order elliptic PDE in Barron spaces, we prove the existence of sampled networks with $L^2$ convergence to the solution. We demonstrate our approach on several time-dependent and static PDEs. We also illustrate how sampled networks can effectively solve inverse problems in this setting. Benefits compared to common numerical schemes include spectral convergence and mesh-free construction of basis functions.

6/3/2024

cs.LG cs.NA

PICL: Physics Informed Contrastive Learning for Partial Differential Equations

Cooper Lorsung, Amir Barati Farimani

Neural operators have recently grown in popularity as Partial Differential Equation (PDE) surrogate models. Learning solution functionals, rather than functions, has proven to be a powerful approach to calculate fast, accurate solutions to complex PDEs. While much work has been done evaluating neural operator performance on a wide variety of surrogate modeling tasks, these works normally evaluate performance on a single equation at a time. In this work, we develop a novel contrastive pretraining framework utilizing Generalized Contrastive Loss that improves neural operator generalization across multiple governing equations simultaneously. Governing equation coefficients are used to measure ground-truth similarity between systems. A combination of physics-informed system evolution and latent-space model output are anchored to input data and used in our distance function. We find that physics-informed contrastive pretraining improves accuracy for the Fourier Neural Operator in fixed-future and autoregressive rollout tasks for the 1D and 2D Heat, Burgers', and linear advection equations.

6/18/2024

cs.LG cs.NA

Benign overfitting in Fixed Dimension via Physics-Informed Learning with Smooth Inductive Bias

Honam Wong, Wendao Wu, Fanghui Liu, Yiping Lu

Recent advances in machine learning have inspired a surge of research into reconstructing specific quantities of interest from measurements that comply with certain physical laws. These efforts focus on inverse problems that are governed by partial differential equations (PDEs). In this work, we develop an asymptotic Sobolev norm learning curve for kernel ridge(less) regression when addressing (elliptical) linear inverse problems. Our results show that the PDE operators in the inverse problem can stabilize the variance and even behave benign overfitting for fixed-dimensional problems, exhibiting different behaviors from regression problems. Besides, our investigation also demonstrates the impact of various inductive biases introduced by minimizing different Sobolev norms as a form of implicit regularization. For the regularized least squares estimator, we find that all considered inductive biases can achieve the optimal convergence rate, provided the regularization parameter is appropriately chosen. The convergence rate is actually independent to the choice of (smooth enough) inductive bias for both ridge and ridgeless regression. Surprisingly, our smoothness requirement recovered the condition found in Bayesian setting and extend the conclusion to the minimum norm interpolation estimators.

6/18/2024

stat.ML cs.IT cs.LG cs.NA

Physics-Informed Neural Networks: Minimizing Residual Loss with Wide Networks and Effective Activations

Nima Hosseini Dashtbayaz, Ghazal Farhani, Boyu Wang, Charles X. Ling

The residual loss in Physics-Informed Neural Networks (PINNs) alters the simple recursive relation of layers in a feed-forward neural network by applying a differential operator, resulting in a loss landscape that is inherently different from those of common supervised problems. Therefore, relying on the existing theory leads to unjustified design choices and suboptimal performance. In this work, we analyze the residual loss by studying its characteristics at critical points to find the conditions that result in effective training of PINNs. Specifically, we first show that under certain conditions, the residual loss of PINNs can be globally minimized by a wide neural network. Furthermore, our analysis also reveals that an activation function with well-behaved high-order derivatives plays a crucial role in minimizing the residual loss. In particular, to solve a $k$-th order PDE, the $k$-th derivative of the activation function should be bijective. The established theory paves the way for designing and choosing effective activation functions for PINNs and explains why periodic activations have shown promising performance in certain cases. Finally, we verify our findings by conducting a set of experiments on several PDEs. Our code is publicly available at https://github.com/nimahsn/pinns_tf2.

6/14/2024

cs.LG