### Physical background

In this work we focus on a biomedical problem: the flow inside blood vessels with stenosis, leading to pathological blood flow. Such vascular stenoses can cause major health complications, especially in cardiology and neurology. The anatomy of stenosed vessels is highly individual, so the vessels can be considered non-parametric geometries. Blood flow can be described with the conservation equations of fluid mechanics, in particular the Navier–Stokes equations. We want to point out that the proposed workflow could also be adapted to problems from other fields of continuum mechanics, such as structural mechanics or heat and mass transfer.

The Navier–Stokes equations can be written in residual formulation as follows:

$$ \mathcal{F}\left( \mathbf{U}, p \right) = 0 := \begin{cases} \nabla \cdot \mathbf{U} = 0, & \mathbf{X}, t \in \Omega_{f,t},\ \boldsymbol{\Theta} \in \mathbb{R}^{d}, \\ \dfrac{\partial \mathbf{U}}{\partial t} + \left( \mathbf{U} \cdot \nabla \right)\mathbf{U} + \dfrac{1}{\rho}\nabla p - \nu \nabla^{2}\mathbf{U} + \mathbf{b}_{f} = 0, & \mathbf{X}, t \in \Omega_{f,t},\ \boldsymbol{\Theta} \in \mathbb{R}^{d} \end{cases} $$

(1)

Here **X** denotes the spatial coordinates and t the time. The PDE is constrained by \(\boldsymbol{\Theta}\), which describes parameters such as boundary and initial conditions and fluid properties. In the formula, \(\mathbf{U}\left( t, \mathbf{X}, \boldsymbol{\Theta} \right)\) is the fluid velocity, with components u, v, w in three dimensions, and \(p\left( t, \mathbf{X}, \boldsymbol{\Theta} \right)\) is the corresponding pressure. \(\rho\) and \(\nu\) are the density and the kinematic viscosity of the fluid, respectively. The method has been tested for two-dimensional steady-state cases.

### GAPINN framework

The GAPINN framework consists of three separate networks, see Fig. 1: (1) a Shape Encoding Network (SEN), one of the most important parts, which enables solving for varying non-parametric geometries; (2) a Physics Informed Neural Network (PINN), which solves the differential equation of a given fluid mechanical problem; and (3) a Boundary Constrain Network (BCN), which constrains the boundary and initial conditions for each given non-parametric geometric boundary.

The shape of a fluid domain in which the Navier–Stokes equations are to be solved was defined by spatial positions **X**_{j,(i,b)}, where the subscript i denotes locations in the internal field of the fluid domain, b denotes locations on the boundary, and j indicates different cases defined by a set of varying non-parametric fluid-domain geometries (made up of i and b). The framework outputs velocities **U**_{j,(i,b)} and pressures p_{j,(i,b)} at these spatial locations. Within the framework, the SEN was used to reduce the dimensions of the given non-parametric geometric boundaries. This latent representation was then concatenated with the spatial positions and served as input for the PINN and the BCN, see Fig. 1.

The GAPINN framework was trained in stages: the SEN was trained first, generating a latent representation; the BCN was trained second; and the PINN was trained last, given the information from the SEN and BCN, as depicted in Fig. 1. After the training of the SEN and BCN was completed, the weights and biases of these networks were no longer adjusted.

We first describe the SEN in more detail to help the reader understand how different fluid-domain geometries can be interpreted by a PINN, and how this facilitates a surrogate model that can solve fluid mechanical partial differential equations for various non-parametric geometries without the need for training data.

### Shape encoding network (SEN)

As input we assumed a non-parametric but well-defined fluid domain. We aimed to obtain a latent representation of each geometric shape using a Variational Auto-Encoder (VAE) [9]. VAEs are a common technique in computer vision for reducing high-dimensional information to a lower-dimensional representation in an unsupervised learning process. A VAE is built from two main components: the encoder, which reduces the dimensions, and the decoder, which reconstructs the input from the lower-dimensional representation.

One reason we recommend a VAE instead of a plain Auto-Encoder (AE) is that we found poor validation performance for AEs. In order to obtain a feasible latent representation, in terms of interpolation capabilities for geometrically similar shapes, we used a VAE with a regularization term that fits the latent vector to a known distribution.

To ensure that the low-dimensional representation is robust against permutation, we chose a PointNet-like architecture [10]. The main concept of this type of network is the use of multilayer perceptrons with shared weights and a globally acting “symmetric” pooling function to construct the lower-dimensional representation from a set of points. For this study we used one-dimensional convolution operators with a kernel size of 1 and a stride of 1, which in principle implement a multilayer perceptron with shared weights. A schematic depiction of the encoder network used in the experiments is shown in Fig. 2. The input of the network was a subset of points representing the boundary (**X**_{j,b} with size n_{b}). This was followed by four 1d convolution layers that increased the spatial dimension d to 512 channels (Conv1: 128, Conv2: 128, Conv3: 256, Conv4: 512). Subsequently, a channel-wise max pooling operation aggregated information shared by all points, resulting in an array with 512 channels. This was followed by two fully connected neural networks (FCNN) with one hidden layer each, which reduced the pooled vector to the desired lower-dimensional size n_{k} of the latent vector **k**, split into predictions of mean and variance. These describe the posterior distribution predicted by the encoder. Subsequent steps can use either both vectors (mean and variance) or only the mean vector; in our cases, applying the mean vector alone yielded proper results.
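The encoder described above can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' exact implementation: layer and class names are assumptions, but the channel sizes (128, 128, 256, 512), the kernel-size-1 convolutions, and the channel-wise max pooling follow the text.

```python
import torch
import torch.nn as nn

class ShapeEncoder(nn.Module):
    """PointNet-like VAE encoder sketch: boundary points -> latent mean/log-variance."""

    def __init__(self, d=2, n_k=8):
        super().__init__()
        # 1D convolutions with kernel size 1 and stride 1 act as a
        # multilayer perceptron with weights shared across the n_b points.
        self.convs = nn.Sequential(
            nn.Conv1d(d, 128, kernel_size=1), nn.ReLU(),
            nn.Conv1d(128, 128, kernel_size=1), nn.ReLU(),
            nn.Conv1d(128, 256, kernel_size=1), nn.ReLU(),
            nn.Conv1d(256, 512, kernel_size=1), nn.ReLU(),
        )
        # Two fully connected heads with one hidden layer predict the
        # mean and log-variance of the posterior over the latent vector k.
        self.fc_mean = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, n_k))
        self.fc_logvar = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, n_k))

    def forward(self, x_b):
        # x_b: [batch, d, n_b] boundary point cloud
        feat = self.convs(x_b)            # [batch, 512, n_b]
        pooled = feat.max(dim=2).values   # symmetric, permutation-invariant pooling
        return self.fc_mean(pooled), self.fc_logvar(pooled)

enc = ShapeEncoder(d=2, n_k=8)
mean, logvar = enc(torch.randn(4, 2, 100))  # 4 shapes, 100 boundary points each
```

Because the max pooling acts per channel over all points, permuting the input points leaves the pooled feature, and hence the latent vector, unchanged.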

During training the encoder was constrained to learn a given distribution (in our experiments, a Gaussian distribution). From the output of the encoder, the decoder reconstructs the input point cloud; the decoder consisted of an FCNN with three hidden layers. After training of the SEN, the decoder was no longer needed and only the encoder module was of importance. The reconstruction and the real input were compared using the Chamfer distance function. During training, the sum of the Kullback–Leibler divergence [11] between the approximate and true prior distribution (\({\mathcal{L}}_{KL}\)), weighted with \(\beta\), and the Chamfer distance loss (\({\mathcal{L}}_{REC}\)) [12], here called \({\mathcal{L}}_{VAE}\), was minimized.

$$ {\mathcal{L}}_{VAE} = {\mathcal{L}}_{REC} - \beta {\mathcal{L}}_{KL} $$

(2)

With \({\mathcal{L}}_{REC}\) given by the following formula:

$$ \mathcal{L}_{REC} = \sum_{S \in \mathbf{X}_{b}} \min_{\hat{S} \in \hat{\mathbf{X}}_{b}} \left\| S - \hat{S} \right\|_{2}^{2} + \sum_{\hat{S} \in \hat{\mathbf{X}}_{b}} \min_{S \in \mathbf{X}_{b}} \left\| S - \hat{S} \right\|_{2}^{2} $$

(3)

With \(S\) being points from \(\mathbf{X}_{b}\) and \(\hat{S}\) their reconstructions from \(\hat{\mathbf{X}}_{b}\) produced by the decoder. That is, \({\mathcal{L}}_{REC}\) sums the distance between each point of the subset **X**_{b} and its nearest neighbor in the reconstruction \(\hat{\mathbf{X}}_{b}\), and vice versa.
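Eq. 3 can be implemented in a few lines with plain PyTorch broadcasting; the following is a minimal sketch (function name and tensor shapes are our own choices):

```python
import torch

def chamfer_distance(X_b, X_b_hat):
    """Chamfer distance (Eq. 3). X_b: [n, d] original boundary points,
    X_b_hat: [m, d] reconstructed points."""
    # Pairwise squared Euclidean distances, shape [n, m].
    d2 = ((X_b[:, None, :] - X_b_hat[None, :, :]) ** 2).sum(dim=-1)
    # For every point, the squared distance to its nearest neighbor in the
    # other set; the loss sums both directions.
    return d2.min(dim=1).values.sum() + d2.min(dim=0).values.sum()

X = torch.tensor([[0.0, 0.0], [1.0, 0.0]])
assert chamfer_distance(X, X) == 0.0  # identical sets have zero distance
```

For large point clouds the [n, m] distance matrix can become memory-heavy; dedicated implementations (e.g. in pytorch3d) compute the nearest neighbors without materializing it.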

### Physics informed neural network (PINN)

The PINN was a fully connected feedforward network (FCNN). Inputs to the PINN were, on the one hand, the spatial positions **X**_{j,(i,b)} and, on the other hand, the encoded information **k** for each geometry domain. These two inputs were concatenated along the dimensional axis of **X** before being fed into the network. The dimension of the input is [n, d + n_{k}], with n being the number of points from a subset of **X**_{j,i} and **X**_{j,b}, d the number of spatial dimensions, and n_{k} the dimension of the latent vector **k**. The network maps the inputs to the velocity **U**_{j,(i,b)} and pressure p_{j,(i,b)}. For more general information on feedforward neural networks we recommend [13], or [6, 14] for more specific aspects regarding PINNs. Layers between the input and output are referred to as hidden layers. As activation function after each hidden layer we used the sigmoid linear unit [15], whose output is the logistic sigmoid of the input multiplied by the input itself.
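The input assembly can be sketched as below. Width and depth are illustrative assumptions, not the paper's values; the shape arithmetic ([n, d] coordinates concatenated with a broadcast latent vector into [n, d + n_k]) and the SiLU activation follow the text.

```python
import torch
import torch.nn as nn

class PINN(nn.Module):
    """FCNN sketch mapping (X, k) -> (U, p). In 2D the output is (u, v, p)."""

    def __init__(self, d=2, n_k=8, width=64, depth=4):
        super().__init__()
        layers, in_features = [], d + n_k
        for _ in range(depth):
            layers += [nn.Linear(in_features, width), nn.SiLU()]  # SiLU: x * sigmoid(x)
            in_features = width
        layers.append(nn.Linear(width, d + 1))  # d velocity components + pressure
        self.net = nn.Sequential(*layers)

    def forward(self, X, k):
        # Repeat the per-geometry latent vector for every collocation point,
        # then concatenate along the feature axis: [n, d + n_k].
        k_rep = k.expand(X.shape[0], -1)
        return self.net(torch.cat([X, k_rep], dim=1))

pinn = PINN(d=2, n_k=8)
out = pinn(torch.rand(100, 2), torch.rand(1, 8))  # [100, 3]: u, v, p
```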

During training, the weights (**W**) and biases (**b**) were adapted so as to minimize the loss, see Eq. 5. The loss function describes the physics in the form of a differential equation in its residual formulation. In the case of the steady-state Navier–Stokes equations, the loss is given by the residuals of the conservation of mass and momentum in the fluid domain, see Eq. 4.

$$ \mathcal{L}_{phy}\left( \mathbf{W}, \mathbf{b} \right) = \left\| \nabla \cdot \mathbf{U} \right\|^{2} + \left\| \left( \mathbf{U} \cdot \nabla \right)\mathbf{U} + \frac{1}{\rho}\nabla p - \nu \nabla^{2}\mathbf{U} \right\|^{2} $$

(4)

$$ \mathbf{W}^{*} = \arg \min_{\mathbf{W}} \mathcal{L}_{phy}\left( \mathbf{W}, \mathbf{b} \right) $$

(5)

For this loss function, several first- and second-order derivatives of the outputs (**U**, p) with respect to the spatial coordinates **X** were needed. These calculations were performed by means of automatic differentiation (AD). AD is a common technique in machine learning, used mainly to obtain gradients of the network with respect to the weights and biases. It relies on computing derivatives inside a computational graph and is implemented in most state-of-the-art deep-learning libraries such as TensorFlow, PyTorch, or Theano; here we worked with PyTorch. For solving the optimization problem (Eq. 5), we used the Adam algorithm [16].
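As a minimal sketch of this use of AD, the continuity residual \(\nabla \cdot \mathbf{U}\) for a 2D model can be obtained with `torch.autograd.grad`; the function name and the stand-in model are our own illustrations. Summing the output before differentiating yields the per-point derivatives, since each output row depends only on its own input row.

```python
import torch

def continuity_residual(model, X, k):
    """Divergence of the predicted velocity at each point X (2D, steady state)."""
    X = X.requires_grad_(True)
    out = model(X, k)               # [n, 3]: u, v, p
    u, v = out[:, 0], out[:, 1]
    # du/dx and dv/dy via automatic differentiation through the graph;
    # create_graph=True keeps the graph so the residual itself is differentiable.
    du = torch.autograd.grad(u.sum(), X, create_graph=True)[0]
    dv = torch.autograd.grad(v.sum(), X, create_graph=True)[0]
    return du[:, 0] + dv[:, 1]      # should vanish for a divergence-free field

# Stand-in "model": u = x, v = -y, p = 0 is exactly divergence-free.
model = lambda X, k: torch.cat([X[:, :1], -X[:, 1:2], 0 * X[:, :1]], dim=1)
res = continuity_residual(model, torch.rand(10, 2), None)
```

Second-order terms such as \(\nu \nabla^{2}\mathbf{U}\) are obtained the same way, by differentiating `du` once more with respect to `X`.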

### Boundary enforcing

Boundary conditions can be imposed mainly in two ways. First, by adding an extra penalty loss term \({\mathcal{L}}_{soft}\) to Eq. 4, which encourages the PINN to learn the conditions on the boundary by minimizing \({\mathcal{L}}\left( {\mathbf{W}},{\mathbf{b}} \right)\) during training, with **W** and **b** being the weights and biases of the neural network, see Eq. 6. Sun et al. showed several major drawbacks of this so-called soft boundary imposition: due to its implicit manner, it cannot ensure the exact satisfaction of the initial and boundary conditions, and the optimization performance can depend on the relative importance of the boundary loss term and the PDE loss term [2]. This could be addressed by weighting the terms, but an a-priori weighting will mostly not be known.
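The weighting issue can be made concrete with a small sketch: the soft-constrained total loss is a weighted sum, and the weight (here called `lambda_b`, an illustrative name) must be chosen before training.

```python
import torch

def total_loss(residual, U_pred_boundary, U_bc, lambda_b=1.0):
    """Soft boundary imposition (Eq. 6): L = L_phy + lambda_b * L_soft.

    residual:         PDE residual at collocation points
    U_pred_boundary:  predicted velocity on the boundary
    U_bc:             prescribed boundary velocity
    lambda_b:         a-priori weight, generally unknown in advance
    """
    L_phy = (residual ** 2).mean()
    L_soft = ((U_pred_boundary - U_bc) ** 2).mean()
    return L_phy + lambda_b * L_soft
```

If `lambda_b` is too small the optimizer may satisfy the PDE while violating the boundary conditions, and vice versa, which is exactly the drawback noted above.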

$$ {\mathcal{L}}\left( {{\mathbf{W}},{\mathbf{b}}} \right) = {\mathcal{L}}_{phy} + {\mathcal{L}}_{soft} $$

(6)

The second approach is to hard-enforce the boundary conditions in the neural network. This can be implemented by constructing a set of functions that take the outputs of the PINN, \({\hat{\mathbf{U}}}_{j,(i,b)}\) and \(\hat{p}_{j,(i,b)}\), as input and, while automatically satisfying the boundary conditions, compute the outputs \({\mathbf{U}}_{j,(i,b)}\) and \(p_{j,(i,b)}\), see Eq. 7. It is beneficial that these functions are “smooth” on \({\mathbf{X}}_{j,(i,b)}\). For most boundary problems we can use functions that indicate where a boundary condition should be constrained, here referred to as \(B\left( {\hat{\mathbf{U}}}_{b,i}, \hat{p}_{b,i} \right)\). In addition, functions that apply the correct value on the boundary, here expressed as \({\mathbf{C}}\left( {\mathbf{X}}_{j,(i,b)} \right)\), need to be set. This concept is motivated by [17].

$$ \begin{array}{c} {\mathbf{U}}_{j,(i,b)} \\ p_{j,(i,b)} \\ \end{array} = B\left( {\hat{\mathbf{U}}}_{j,(i,b)}, \hat{p}_{j,(i,b)}, {\mathbf{X}}_{j,(i,b)} \right) + {\mathbf{C}}\left( {\mathbf{X}}_{j,(i,b)} \right) $$

(7)

To hard-constrain boundary conditions for common fluid mechanic domains, we propose using a pre-trained neural network that predicts the minimal Euclidean distance of each \({\mathbf{X}}_{j,i}\) to the walls, which are considered fixed, here indicated as the function BCN\(\left( {\mathbf{X}}_{j,(i,b)} \right)\). We used a simple FCNN for this task. The advantage of this approach lies in the capability of tracking gradients, so that automatic differentiation can be used during neural network training; therefore we did not compute the distance on the fly while training.

The network takes as input the spatial positions \({\mathbf{X}}_{j,(i,b)}\) and the latent vector **k**. The prediction of the BCN was compared to pre-computed Euclidean distances of the spatial positions to the boundaries with fixed zero velocity, using the mean squared error. The mean squared error was reduced during the training of the BCN by adapting the weights of the neural network. The exact forms of the functions B and C depend strongly on the fluid mechanical problem to be solved; for a detailed description of how to construct the boundary functions for a specific problem, we refer to the following experiment chapter.
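The role of the predicted wall distance can be illustrated with a sketch: multiplying the raw network velocity by the distance forces the velocity to vanish exactly on the wall, while the composition remains differentiable for autograd. The stand-in `bcn` below is an analytic distance for a straight channel with walls at y = 0 and y = 1, not a trained network; all names are illustrative.

```python
import torch

def hard_constrained_velocity(U_hat, X, bcn):
    """One possible B-function: U = d(X) * U_hat, so U = 0 wherever d(X) = 0."""
    d = bcn(X)        # predicted minimal wall distance, shape [n, 1]
    return d * U_hat  # broadcasts over the velocity components

# Stand-in for a trained BCN: exact distance to the walls y = 0 and y = 1.
bcn = lambda X: torch.minimum(X[:, 1:2], 1.0 - X[:, 1:2])

X = torch.tensor([[0.5, 0.0],   # a point on the lower wall
                  [0.5, 0.5]])  # an interior point
U = hard_constrained_velocity(torch.ones(2, 2), X, bcn)
# U is exactly zero at the wall point, regardless of the raw prediction.
```

A trained BCN replaces the lambda with a network of the same signature, which is precisely why the distance is predicted by a network rather than looked up: the multiplication stays inside the computational graph.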