Universal Differential Equations (UDEs) are a class of models that combine mechanistic knowledge (e.g., known physical laws) with data-driven components (e.g., neural networks) to model complex systems.
If some parts of the dynamics are known, we can encode that knowledge in the structure of the ODEs and use any universal approximator (e.g., neural network) to model the unknown parts of the dynamics.
We focus on neural networks as the universal approximator in this lecture, but the same principles apply to other types of universal approximators.
We can either start from a neural ODE and add mechanistic terms, or start from a mechanistic model and add neural network terms to capture unknown dynamics.
\[ \begin{aligned} \frac{d\text{Depot}}{dt} & = \text{NN}(\text{Depot}, \text{Central}, R)[1] \\ \frac{d\text{Central}}{dt} & = \text{NN}(\text{Depot}, \text{Central}, R)[2] \\ \frac{dR}{dt} & = \text{NN}(\text{Depot}, \text{Central}, R)[3] \end{aligned} \]
\[ \begin{aligned} \frac{d\text{Depot}}{dt} & = -\text{NN}_1(\text{Depot}) \\ \frac{d\text{Central}}{dt} & = \text{NN}_1(\text{Depot}) - \text{NN}_2(\text{Central}) \\ \frac{dR}{dt} & = \text{NN}_3(\text{Central}, R) \end{aligned} \]
\[ \begin{aligned} \frac{d\text{Depot}}{dt} & = -K_a \cdot \text{Depot} \\ \frac{d\text{Central}}{dt} & = K_a \cdot \text{Depot} - \frac{\text{CL}}{V_c} \cdot \text{Central} \\ \frac{dR}{dt} & = \text{NN} \Bigg( \frac{\text{Central}}{V_c}, R \Bigg) \end{aligned} \]
\[ \begin{aligned} \frac{d\text{Depot}}{dt} & = -K_a \cdot \text{Depot} \\ \frac{d\text{Central}}{dt} & = K_a \cdot \text{Depot} - \frac{\text{CL}}{V_c} \cdot \text{Central} \\ \frac{dR}{dt} & = k_\text{in} \cdot \Bigg( 1 + \text{NN}\Bigg( \frac{\text{Central}}{V_c} \Bigg) \Bigg) - k_\text{out} \cdot R \end{aligned} \]
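As an illustration, the last UDE above can be written as a right-hand-side function in which the known PK terms are hard-coded and a small neural network models the unknown drug effect on \(R\). This is a minimal sketch in plain NumPy; the two-layer MLP and all names (`mlp`, `ude_rhs`) are illustrative, not a specific library API:

```python
import numpy as np

def mlp(params, x):
    """Tiny two-layer MLP: the neural (data-driven) component of the UDE."""
    W1, b1, W2, b2 = params
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2

def ude_rhs(u, t, p, nn_params):
    """Right-hand side of the indirect-response UDE above:
    mechanistic PK terms plus NN(Central/Vc) for the unknown stimulation of R."""
    depot, central, r = u
    ka, cl, vc, k_in, k_out = p
    conc = central / vc
    effect = mlp(nn_params, np.array([conc]))[0]   # scalar NN output
    d_depot = -ka * depot
    d_central = ka * depot - (cl / vc) * central
    d_r = k_in * (1.0 + effect) - k_out * r
    return np.array([d_depot, d_central, d_r])
```

This function can be passed to any standard ODE solver; the NN parameters are then fit so that the solved trajectories match observed data.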
Training PINNs involves differentiating the loss function with respect to the parameters of the neural network to get the gradient for optimization.
If the dynamics of the system are not fully known, they can be described using a UDE while still using the PINN loss.
The ODE/PDE/UDE parameters can be learned together with the neural network weights representing the solution.
The constraints can be enforced on a subset of the compartments if the dynamics of some compartments are better understood than others.
The known ODE/PDE may not be sufficient to fully describe the observed data.
PINNs are typically more data efficient than purely data-driven models.
PINNs are easy to implement and well suited to GPUs.
Even if the dynamics are well understood, PINNs can often be used to avoid solving the ODE/PDE directly using classical numerical methods. However, this is generally not recommended because:
The solution is sensitive to the values of the penalty coefficients \(\lambda\).
It can be difficult to find good values for \(\lambda\), especially if the constraints are on different scales.
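To make the role of \(\lambda\) concrete, here is a minimal sketch of a PINN-style composite loss for the scalar ODE \(dy/dt = -k y\). A quadratic trial function stands in for the neural network so that the residual can be computed exactly; all names are illustrative:

```python
import numpy as np

def pinn_loss(theta, lam, t_data, y_data, t_col, k=1.0):
    """Composite PINN loss for dy/dt = -k*y with a quadratic ansatz
    y(t) = theta0 + theta1*t + theta2*t^2 (stand-in for a neural network)."""
    y = lambda t: theta[0] + theta[1] * t + theta[2] * t**2
    dy = lambda t: theta[1] + 2.0 * theta[2] * t            # exact derivative of the ansatz
    data_loss = np.mean((y(t_data) - y_data) ** 2)          # fit to observations
    physics_loss = np.mean((dy(t_col) + k * y(t_col)) ** 2) # ODE residual at collocation points
    return data_loss + lam * physics_loss
```

Because the total loss is `data_loss + lam * physics_loss`, rescaling \(\lambda\) directly rescales how much the optimizer trades data fit against physics consistency, which is why its value matters so much when the two terms live on different scales.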
Despite the limitations of PINNs, when learning the parameters of a model from data (inverse problem), an accurate solution may not be necessary at the beginning of training.
PINNs can learn to approximate the solution while learning the parameters simultaneously.
In some (but not all) cases, this may be more efficient than repeatedly solving the ODE/PDE with different parameters using an accurate numerical solver.
Solving inverse problems with gradient-based algorithms also requires differentiating through the numerical solver.
The PINN solution can then be used to initialize more accurate, but slower, numerical solvers and gradient computations.
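A small sketch of this inverse-problem setup, where the ODE parameter \(k\) is optimized jointly with the trial-function coefficients. Finite-difference gradient descent stands in for automatic differentiation, and all names are illustrative:

```python
import numpy as np

def joint_loss(p, t_data, y_data, t_col, lam=1.0):
    """PINN-style loss in which the ODE parameter k is learned jointly
    with the trial-function coefficients: p = [theta0, theta1, theta2, k]."""
    th0, th1, th2, k = p
    y = lambda t: th0 + th1 * t + th2 * t**2
    dy = lambda t: th1 + 2.0 * th2 * t
    data = np.mean((y(t_data) - y_data) ** 2)           # data misfit
    phys = np.mean((dy(t_col) + k * y(t_col)) ** 2)     # ODE residual dy/dt + k*y
    return data + lam * phys

def fit(p0, args, lr=1e-2, steps=500, h=1e-6):
    """Finite-difference gradient descent over all parameters at once
    (a stand-in for reverse-mode automatic differentiation)."""
    p = np.asarray(p0, dtype=float).copy()
    for _ in range(steps):
        g = np.array([(joint_loss(p + h * e, *args) - joint_loss(p - h * e, *args)) / (2 * h)
                      for e in np.eye(len(p))])
        p = p - lr * g
    return p
```

Note that a single optimization loop updates the approximate solution and the physical parameter together, with no ODE solver in the loop, which is the efficiency argument made above.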
A DeepONet typically consists of two main components:
Branch network: encodes the input function, sampled at a fixed set of sensor locations, into a feature vector.
Trunk network: encodes the location at which the output function is evaluated into a feature vector of the same dimension; the DeepONet output is the inner product of the two.
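The two components can be sketched as follows: the branch network sees the input function at fixed sensor points, the trunk network sees the query location, and the operator output \(G(u)(y)\) is the inner product of their encodings. Network sizes and names here are illustrative:

```python
import numpy as np

def dense(params, x):
    """Two-layer tanh network used for both the branch and trunk sketches."""
    W1, b1, W2, b2 = params
    return W2 @ np.tanh(W1 @ x + b1) + b2

def deeponet(branch_params, trunk_params, u_sensors, y):
    """G(u)(y) ~= <branch(u(sensors)), trunk(y)>.
    The branch encodes the input function sampled at fixed sensor points;
    the trunk encodes the query location; both map into the same p-dim space."""
    b = dense(branch_params, u_sensors)           # shape (p,)
    t = dense(trunk_params, np.atleast_1d(y))     # shape (p,)
    return float(b @ t)
```

Evaluating a new input function only requires a fresh forward pass through the branch network; the trunk network can then be queried at arbitrary locations \(y\).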