In this section, we present our Bayesian calibration approach for a coupled fluid–structure interaction problem. We demonstrate the approach with generated data so that its individual steps can be assessed more easily. Our focus lies on the presentation of the proposed approach and on some general characteristics of the resulting posteriors; in applications with real-world data, the procedure can be used without any changes. The computational mechanics models in the examples are schematic models for the fluid–biofilm interaction that motivates our research, in which the fluid–solid interface deforms as a consequence of the interaction. A further description of the experiments is deferred to the appendix, as we want to focus on the model here. In the following examples, we calibrate biofilm material properties under partially uncertain experimental conditions. For the numerical demonstrations we use the fluid–structure interaction (FSI) between incompressible Navier–Stokes flow and a hyperelastic nonlinear solid material model, which is briefly introduced in the appendix. Although the presented calibration approach is equally applicable to single-field problems with deformable boundaries, we deliberately choose a coupled multi-physics FSI model to highlight the benefit of the approach in such applications. A variety of different models for biofilms are available, and further effects can be included (see, e.g., [12, 51]), which would simply lead to different forward models.

### Problem setup

The calibration is performed for the hyperelastic material properties of the solid domain, for which we calibrate the two parameters of a Saint-Venant–Kirchhoff material model. For the given setup of FSI models for biofilms and the biofilm flow cell data, the location of the fluid–biofilm interface is the primary data available and is therefore used for comparison. A schematic sketch of the problem setup is drawn in Fig. 2. For easier demonstration, we first investigate a two-dimensional calibration problem (\(\dim (\varvec{x})=2\)) without experimental uncertainties (\(\dim (\varvec{\theta })=0\)) and then move on to more complex examples.

We chose the same problem setup as presented in [12]. The biofilm geometry is inspired by analyses of experimental results in [11, 14]. The model domain represents a two-dimensional channel with dimensions \(1\,\textrm{mm}\times 2\,\textrm{mm}\), where horizontal fluid flow with a parabolic profile is enforced on the left boundary (see \(\Gamma ^{\textrm{F}}_\textrm{in}\) in Fig. 2) with a maximal volume rate of \( \dot{V}_{\textrm{in}}= 100\,{\mathrm {mm^2}/\textrm{s}} \). The solid biofilm (green) is attached to the channel floor. A no-slip condition is applied to the fluid on the channel floor and top boundary as well as to the biofilm on the channel floor (see \(\Gamma ^{\textrm{F}}_\textrm{D}\) and \(\Gamma ^{\textrm{S}}_\textrm{D}\) in Fig. 2), and a horizontal outflow is enforced on the right edge (see \(\Gamma ^{\textrm{F}}_\textrm{out}\) in Fig. 2). These boundary conditions are modeled via Dirichlet boundary conditions on the fluid velocity (and on the solid displacement accordingly) and a free outflow in the horizontal direction, respectively. The horizontal condition represents the continuation of the empty channel to the left and right of the modeled domain.

To generate the artificial experimental data, we use a forward simulation of the fluid–biofilm interaction with a Saint-Venant–Kirchhoff material model for the solid biofilm domain. The material is characterized by two parameters, namely Young’s modulus \(E\) and Poisson’s ratio \(\nu \). We summarize the input parameters chosen for the data generation in the ground truth vector \({\varvec{x}}_{\textrm{gt}}=\begin{bmatrix}\nu = 0.3, E= 400\,\textrm{Pa}\end{bmatrix}^{\textsf{T}}\). As a model for water, the fluid has a dynamic viscosity of \(\mu ^{{\textrm{F}}}= 10^{-3}\,\mathrm {Pa\, s} \) and a density of \( \rho ^{{\textrm{F}}}= 10^3\,{\textrm{kg}/\mathrm {m^3}} \). The biofilm has the same density as the fluid. The solution of the velocity and pressure field of the fluid and of the displacement field of the biofilm is depicted in Fig. 3 for the regarded quasi-steady deformed state in the ground truth forward model evaluation. As a reaction to the load imposed by the fluid inflow boundary condition, the solid bends towards the right, as plotted in Fig. 4. In Fig. 4a we see the artificial observation data \(Y_{\textrm{obs},C}\), the result of the reference simulation with ground truth values \({\varvec{x}}_{\textrm{gt}}\). \(Y_{\textrm{obs},C}\) represents the deformed location (green) of the interface at one single point in time \(C\). In Fig. 4b we additionally plot the results for some exemplary parameter combinations.

### Likelihood response surface for different discrepancy measures

In a first step, we compare the effect of the different discrepancy measures introduced above. We directly approximate the log-likelihood by the posterior mean function of a Gaussian process (GP) surrogate and use the discussed discrepancy measures. Additionally, we provide the posterior standard deviation of the GP to quantify the remaining uncertainty in the surrogate. All surrogate models for the log-likelihood function resulting from the different discrepancy measures use the same training inputs, which were sampled with a quasi-random Sobol sequence to yield a space-filling training design. The training data consist of these parameter samples for the forward model and the corresponding forward simulation results.
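Such a space-filling Sobol design can be sketched in a few lines; the following is an illustrative example using SciPy's quasi-Monte Carlo module, with the parameter ranges from the later examples assumed as bounds (the helper name `sobol_design` is ours):

```python
import numpy as np
from scipy.stats import qmc

def sobol_design(n_train, bounds, seed=0):
    """Space-filling training design from a scrambled quasi-random Sobol
    sequence.  bounds: array of shape (d, 2) with [lower, upper] per input."""
    bounds = np.asarray(bounds, dtype=float)
    sampler = qmc.Sobol(d=bounds.shape[0], scramble=True, seed=seed)
    unit = sampler.random(n_train)              # samples in [0, 1)^d
    return qmc.scale(unit, bounds[:, 0], bounds[:, 1])

# Example: 64 training points (a power of two, as Sobol prefers) for
# (E, nu) over the ranges used in the examples of this section.
X_train = sobol_design(64, [[100.0, 800.0], [-0.8, 0.5]])
```

Consecutive samples of the same Sobol sequence can simply be appended to grow the design, which is what makes the later convergence study over \(n_\textrm{train}\) convenient.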

For the estimation of a suitable \( \sigma _{\textrm{N}}\) in the likelihood model (4), the known resolution \(\approx 8 \,\mathrm {\mu m} \) of OCT [14, 15] is considered. The noise standard deviation is assumed to be of the same order of magnitude, such that \(\sigma _{\textrm{N}}= 0.01\,\textrm{mm}\) is used in the following. The OCT resolution and the standard deviation in the likelihood model are not expected to be equal; a further discussion is omitted here, as the demonstration works independently of this choice, and a meaningful value can only be determined in relation to real data and the chosen image segmentation. For the RKHS norm, we need the two parameters \(\sigma _{\textrm{N}}\) and \(\sigma _\textrm{W}\). The length scale \(\sigma _\textrm{W}\) for the RBF kernel in the RKHS approach in (10b) is estimated as \(\sigma _\textrm{W}= 0.005\,\textrm{mm}\), approximately \(10\%\) of the maximal displacement magnitude (see Fig. 3(a)).
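To illustrate the role of \(\sigma_{\textrm{N}}\): under an i.i.d. zero-mean Gaussian noise model in the spirit of (4), the log-likelihood of a vector of distance measurements has a simple closed form. A minimal sketch (the function name `gaussian_log_likelihood` is ours, and the exact likelihood (4) of the paper is not reproduced here):

```python
import numpy as np

def gaussian_log_likelihood(distances, sigma_n):
    """Log-likelihood assuming each distance measurement d_i between model
    and observation is an independent draw from N(0, sigma_n^2)."""
    d = np.asarray(distances, dtype=float)
    n = d.size
    return (-0.5 * n * np.log(2.0 * np.pi * sigma_n**2)
            - 0.5 * np.sum(d**2) / sigma_n**2)

# sigma_N = 0.01 mm, motivated by the ~8 um OCT resolution; distances in mm
log_lik = gaussian_log_likelihood([0.005, -0.002, 0.011], sigma_n=0.01)
```

The formula makes the peaking behavior discussed below transparent: for a fixed \(\sigma_{\textrm{N}}\), summing over more individual distance measurements steepens the log-likelihood around its maximum.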

The resulting regression models for the likelihoods are shown in Fig. 5 for \(n_\textrm{train}= 1000\) training points with a Matérn 3/2 kernel (see appendix on GPs, Eq. (27) for details). In general, the likelihood over the parameters can be understood as a score of how similar the forward model response for the respective parameters is to the reference data. A discussion and interpretation of the figure follows below. For this first comparison, we simply use this large number of training points and postpone the discussion of the GP convergence over \(n_\textrm{train}\) to the case using only the RKHS norm measure. For the distribution of the measurement points in this comparison, the reader is referred to [12]. In Fig. 5 the likelihood fields are determined from the logarithmic likelihoods, and those fields are normalized in the plots for better comparability. Parameter combinations that led to failed forward model evaluations are marked by gray crosses in the following figures. For our setup, the failing simulations occurred for low values of Young’s modulus (located on the left side of the plots), representing soft biofilm material, which led to large mesh distortions in the ALE FSI approach. For the sake of comparability, the respective logarithmic likelihoods are plotted in Fig. 6 and the associated standard deviations of the regression models in Fig. 7.

It can be seen that the likelihoods show high values (red) along a characteristic curved shape with its peak at the expected ground truth parameters \({\varvec{x}}_{\textrm{gt}}=\begin{bmatrix}\nu = 0.3,&E= 400\,\textrm{Pa}\end{bmatrix}^{\textsf{T}}\). A high likelihood corresponds to a high probability density of the parameter combination to represent the reference data. This means that the corresponding values for \({\varvec{x}}\) in this region of the input space result in simulation outputs that are very close to the observation data under the employed discrepancy measure. The likelihood falls very close to zero when moving away from the high likelihood regions, representing low similarity of the forward model result to the reference data. In particular, the failed simulations (gray crosses) fall in a region with very low likelihood values, which renders them irrelevant for our investigations. For the rest of the input space, it can be seen in Fig. 7 that the standard deviation of the regression model is low and only rises in regions with little data.

Comparing the Euclidean distance measure in Fig. 5a and the closest point projection in Fig. 5b, it can be seen that the latter is more peaked around the maximum of the likelihood, which is close to the ground truth value \({\varvec{x}}_{\textrm{gt}}\). This is also a consequence of the formulations (4) and (5), as the numbers of distance measurements differ greatly, with the number of measurement points \({n_\mathrm {\textrm{mp}}}=10 \) and the number of interface nodes \({n_\textrm{in}}=66\). A higher number of individual single point measurements generally leads to a more peaked likelihood for the same \(\sigma _{\textrm{N}}\). It must also be stated that no further weighting of the closest point projection distances was done, so all nodal closest point projection distances are considered equally important. This includes the distances for many mesh nodes that are close to the boundary conditions, where the displacement magnitude is therefore lower. An additional data compression approach, e.g., kernel principal component analysis [21, 52], or a selection of only a subset of the interface nodes could be used to increase comparability, but this is outside the scope of the current article. The likelihood from the RKHS based distance measure (Fig. 5c) combines both features: it is expressive around the maximum likelihood (ML) point and still carries information from more distant points. The RKHS norm measure correlates all discretized interface locations and orientations of the model with all counterparts in the observation (see (11)); it is therefore also expected to be the most detailed measure in this comparison.
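The two node-based measures can be sketched as follows. This is an illustrative simplification, not the paper's exact formulas: the closest point projection is reduced to nearest-neighbor distances between point clouds, and the RKHS-style measure to an RBF-kernel squared-norm of the difference of the two interface point sets, omitting the interface orientations that enter (11); all function names are ours.

```python
import numpy as np
from scipy.spatial import cKDTree

def closest_point_distances(model_nodes, obs_curve):
    """Distance from each model interface node to the nearest point of the
    discretized observed interface (a point-to-point simplification of a
    closest point projection)."""
    d, _ = cKDTree(obs_curve).query(model_nodes)
    return d

def rkhs_discrepancy(model_nodes, obs_nodes, sigma_w):
    """RKHS-norm-style squared distance between two interface point clouds
    using an RBF kernel with length scale sigma_w:
        ||f - g||^2 = sum k(x_i, x_j) + sum k(y_i, y_j) - 2 sum k(x_i, y_j).
    Every model node is correlated with every observation node, which is the
    qualitative feature of the measure (11) discussed in the text."""
    def k(a, b):
        d2 = np.sum((a[:, None, :] - b[None, :, :])**2, axis=-1)
        return np.exp(-0.5 * d2 / sigma_w**2)
    return (k(model_nodes, model_nodes).sum()
            + k(obs_nodes, obs_nodes).sum()
            - 2.0 * k(model_nodes, obs_nodes).sum())
```

Both measures vanish for identical interfaces; the all-pairs structure of the RKHS measure is what lets it retain information from distant configurations while remaining sharp near the optimum.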

The likelihood based on the forward model evaluations, suitably combined with prior distributions, results in the posterior (see (1)), which is a probability density of the parameters given the observations. It is favorable to have an expressive posterior, and therefore an expressive likelihood, to be able to compare posterior values for different parameter combinations and thereby develop an understanding of the forward model in relation to the observations. In the case of a very flat likelihood, the conclusion is that all parameter combinations are similarly well suited to explain the observations and are therefore potentially insignificant to the forward model, at least within the regarded parameter intervals: all input parameters \({\varvec{x}}\) lead to forward model results that are very close to the observation under the employed discrepancy measure. The expressive shape of the likelihood based on the RKHS measure underlines that this measure is well suited for Bayesian calibration in the given example. It is therefore used in all following examples.

The presented likelihoods were generated using the different discrepancy measures (6), (7) and (11), and all yield similar characteristics in the shape of the likelihood. It can therefore be concluded that, in this setting with a fluid–biofilm interface-based measurement of a flow cell experiment, an underestimation of Young’s modulus \(E\) is coupled to an underestimation of Poisson’s ratio \(\nu \). In applications with real experimental data, it would make sense to restrict the training points to the intervals that are believed to contain the optimum or at least relevant values for the parameters. This could mean restricting the Poisson’s ratio to positive values, as would be expected for most materials. Nevertheless, the resulting shapes representing high likelihood are very smooth over the whole tested range between \(\nu = -0.8\) and \(\nu = 0.5\) and do not show a distinct border between positive and negative values. This is interesting with regard to biofilm mechanics in flow cell experiments, as it shows that the estimation of \(E\) and \(\nu \) is coupled for all surface comparisons that were tested. For some investigations, this also hints toward the need to incorporate additional measurements or information, e.g., prior knowledge.

### Convergence over number of training points

The most costly part of the presented algorithm is the forward model evaluations that are necessary to generate the training data \(\mathcal {D}\) of the GP log-likelihood regression model. While the convergence of the GP over the number of training data points depends on the character of the underlying function as well as on the selected design of experiments, we want to show a short qualitative convergence study for the problem at hand for the distance measure based on the RKHS inner product. In general, a convergence study of a GP w.r.t. the number of training points is difficult, as it is usually not feasible to increase the data set size by orders of magnitude. Therefore, other strategies, e.g., leave-one-out cross-validation, can be used as a proxy for the regression error [39]. The requirement on the number of training points and the regression model is to capture the relevant shape characteristics of the likelihood.

For the application of such an approach, it is crucial to know how many forward model evaluations are required to get results efficiently. To get a first picture, the Gaussian process regression model for the logarithmic likelihood is created for a series of training point set sizes \(n_\textrm{train}\). For this qualitative study, only the RKHS norm based likelihood was used, as it uses the most detailed comparison and appears to give the most expressive shape of the posterior. The likelihood parameters were set to \(\sigma _\textrm{W}=0.005 \,\textrm{mm}\), \(\sigma _{\textrm{N}}^2=0.0005\,\textrm{mm}^2\) in all following examples. The choice \(\sigma _{\textrm{N}}^2=0.0005\,\textrm{mm}^2 \) in our case relates to the discretization size, as the rounded squared average interface element length is \(\bar{l}^2 \approx 0.0005 \,\textrm{mm}^2 \). It is difficult to interpret \(\sigma _{\textrm{N}}\) in relation to the measurement error alone, as in particular the image segmentation approach chosen to determine the model fluid–biofilm interface from experimental data also influences the choice. We deviated from this choice of \(\sigma _{\textrm{N}}\) in the comparison of the measures to keep it the same for all measures.

We used a Matérn 3/2 kernel (see appendix on GPs, Eq. (27)) with one length scale hyperparameter *l* (in the standardized input space) and one signal variance \(\sigma _0^2\), which are determined via ML estimation of the GP evidence (see Eq. (29)) using an L-BFGS-B optimizer. The Matérn 3/2 kernel appeared to be the best suited to represent the likelihood field resulting for the given example. In particular, the great variety in steepness, and therefore the low smoothness of the likelihood, seems to be the challenge for the regression approach. However, at the moment, no general recommendation for the GP kernel can be given. As commonly done [39], we used a fixed nugget noise \( \sigma _\textrm{n}^2=10^{-5}\cdot \left( \textrm{max}\left( \varvec{\mathfrak {L}}_\textrm{train}\right) -\textrm{min}\left( \varvec{\mathfrak {L}}_\textrm{train}\right) \right) \) to stabilize the training of the GP.
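The described setup — a Matérn 3/2 kernel with a single length scale and signal variance, evidence maximization with L-BFGS-B, and a fixed nugget scaled to the range of the training log-likelihoods — can be sketched generically as follows. This is not the authors' implementation; the helper names are ours, and a zero-mean GP is assumed (center the targets first):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import cdist
from scipy.linalg import cho_factor, cho_solve

def matern32(X1, X2, length_scale, signal_var):
    """Matern 3/2 covariance with a single length scale and signal variance."""
    r = cdist(X1, X2) / length_scale
    return signal_var * (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)

def fit_loglik_surrogate(X, y, nugget):
    """Fit (l, sigma_0^2) by maximizing the GP evidence (log marginal
    likelihood) with L-BFGS-B; return the posterior-mean predictor."""
    n = X.shape[0]

    def neg_log_evidence(log_params):
        l, s2 = np.exp(log_params)          # optimize in log space
        K = matern32(X, X, l, s2) + nugget * np.eye(n)
        try:
            c, low = cho_factor(K)
        except np.linalg.LinAlgError:
            return 1e10                     # reject non-PD covariance
        alpha = cho_solve((c, low), y)
        # 0.5 y^T K^-1 y + 0.5 log|K| + (n/2) log(2 pi)
        return (0.5 * y @ alpha + np.sum(np.log(np.diag(c)))
                + 0.5 * n * np.log(2.0 * np.pi))

    res = minimize(neg_log_evidence, x0=np.log([1.0, max(np.var(y), 1e-6)]),
                   method="L-BFGS-B")
    l, s2 = np.exp(res.x)
    K = matern32(X, X, l, s2) + nugget * np.eye(n)
    alpha = cho_solve(cho_factor(K), y)
    return lambda X_new: matern32(X_new, X, l, s2) @ alpha
```

With the nugget fixed to \(10^{-5}\) times the training-data range, the surrogate nearly interpolates the training log-likelihoods while the Cholesky factorization remains stable.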

In the following, the convergence of the likelihood surrogate model over the number of used training points is qualitatively shown for the problem at hand. For this study, the properties of the Sobol sequence are exploited, as all sample sets consist of consecutive samples from the same Sobol sequence. In Fig. 8 it is apparent that only very few training points are necessary to estimate the general shape of the likelihood distribution and to obtain a rough estimate of the optimum. With \(n_\textrm{train}=200\) samples, the ML estimate is already in good agreement with the ground truth \({\varvec{x}}_{\textrm{gt}}=\begin{bmatrix}\nu = 0.3,&E= 400\,\textrm{Pa}\end{bmatrix}^{\textsf{T}}\).

It can be concluded that for this two-dimensional example the likelihood surrogate captures the relevant features and the global shape well for \(n_\textrm{train}= 200 \) forward model evaluations, and no significant gain in accuracy can be expected from a moderate increase of the sample size. Given that the following examples include a comparison with different priors and add another problem dimension in form of an uncertain parameter, thereby increasing the complexity, we use the Gaussian process model with one length scale for all (standardized) parameters, a fixed nugget noise, and \(n_\textrm{train}=1000\) training points for the following examples (see Remark 6 on dimensionality).

### Remark 7

(Distribution of training points) With the applied approach of a space-filling quasi-random distribution of the samples, there is no compromise between exploration and exploitation; the emphasis is put entirely on exploration. It is possible to use available prior information for the generation of the samples and thereby obtain a higher density of samples in high prior regions, or to use an iterative approach and refine the samples in high posterior regions.

### Calibration of constitutive parameters in biofilm models

In this subsection, the regression model is generated according to the findings of the previous examples. The GP was constructed with a Matérn 3/2 kernel, a single length scale and signal variance for all parameters, and a fixed nugget noise variance \(\sigma _\textrm{n}^2\) in (29). 1000 samples were used for the training of the GP. As the likelihood model is available in the form of a cheap-to-evaluate surrogate, we use 5000 SMC particles and 20 rejuvenation steps per SMC iteration (which would result in 1 million likelihood calls for 10 SMC iterations). For the adaptive step size of the SMC iterator, a control parameter of \(\zeta = 0.995\) was used (see Algorithm 1).
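The role of the control parameter \(\zeta\) in the adaptive step size can be illustrated with a generic tempered-SMC sketch: the next tempering exponent is chosen so that the relative effective sample size (ESS) after reweighting stays at \(\zeta\). This is a common adaptivity scheme and only a sketch of the idea; Algorithm 1 of the paper may differ in details, and the function name is ours.

```python
import numpy as np

def next_temperature(loglik, gamma, zeta=0.995, tol=1e-8):
    """Find the next tempering exponent gamma' in (gamma, 1] such that
    reweighting the particles by (gamma' - gamma) * loglik keeps the
    relative ESS at the control parameter zeta (bisection on the
    monotonically decreasing ESS curve)."""
    loglik = np.asarray(loglik, dtype=float)
    n = loglik.size

    def rel_ess(g):
        w = np.exp((g - gamma) * (loglik - loglik.max()))
        return w.sum()**2 / (n * np.sum(w**2))

    if rel_ess(1.0) >= zeta:        # can jump straight to the full posterior
        return 1.0
    lo, hi = gamma, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rel_ess(mid) >= zeta:
            lo = mid
        else:
            hi = mid
    return lo
```

A value close to 1, such as \(\zeta = 0.995\), yields many small tempering steps with little particle degeneracy per step, which is affordable here precisely because the likelihood is a cheap surrogate.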

In the following examples, we examine the influence of priors on the parameters and of uncertainties on the resulting posteriors. In Fig. 9 all results discussed in this section are plotted on the same page for better comparability. The different cases are described and discussed in the following paragraphs. Figure 9 displays the resulting posteriors of the examples in the form of SMC particle approximations. To visualize the character of the data, one particle distribution is shown explicitly in Fig. 9b, where the particles are plotted at their respective coordinates. The particle weights are illustrated by circle size and a color scale. The other three subplots (Fig. 9a, c, d) show two-dimensional hexagonal histograms in a color scale. The particles are sorted into the displayed bins and summed up using their weights. The percentiles of the two-dimensional posteriors are additionally shown in simplified form as kernel density estimates (KDE), plotted as black solid lines. For the KDE, a radial basis function (RBF) kernel with bandwidth optimization was used.

On the top and right sides, the marginal distributions of the two individual input parameters are plotted as histograms. The one-dimensional marginal posterior distributions \(p\left( E|Y_{\textrm{obs},C}\right) \) and \(p\left( \nu |Y_{\textrm{obs},C}\right) \) can easily be approximated by sorting the weighted particles into bins in the respective dimension. The marginals are additionally displayed in the form of KDEs as black solid lines. The used priors are indicated as red dashed lines in all following plots.
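Approximating a one-dimensional marginal from weighted particles is a one-liner with a weighted histogram; a minimal sketch (the helper name `marginal_histogram` is ours):

```python
import numpy as np

def marginal_histogram(particles, weights, dim, n_bins=30):
    """Approximate a 1D marginal posterior by sorting the weighted SMC
    particles into bins along one input dimension; returns a density
    (bin mass divided by bin width) and the bin edges."""
    x = np.asarray(particles)[:, dim]
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                       # normalize the particle weights
    hist, edges = np.histogram(x, bins=n_bins, weights=w)
    return hist / np.diff(edges), edges
```

The returned density integrates to one by construction, so it can be overlaid directly with the KDE and prior curves.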

Characteristic points deduced from the particle approximation are marked as crosses. The maximum a posteriori (MAP) estimate \({\varvec{x}}_{\textrm{MAP}}\) is marked in green and the posterior mean (PM) \({\varvec{x}}_{\textrm{PM}}\) in orange.

In high dimensions, the posterior cannot be plotted as easily as in the two-dimensional case. The analyst wants to have a comparative overview of where the mean value of the posterior is and how the probability mass is distributed in the posterior. For that and as another common approximation, we also show percentile lines of the global Gaussian approximation to the posterior in orange. The global Gaussian approximation is parameterized by the posterior mean (PM) vector \({\varvec{x}}_{\textrm{PM}}\) and the covariance matrix which can both be calculated from the weighted SMC particles very quickly.
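The global Gaussian approximation described above reduces to a weighted mean and a weighted (biased-form) covariance of the particles; a minimal sketch with our own helper name:

```python
import numpy as np

def gaussian_approximation(particles, weights):
    """Global Gaussian approximation of the posterior from weighted SMC
    particles: the weighted mean vector and weighted covariance matrix."""
    X = np.asarray(particles, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    mean = w @ X                          # weighted mean vector
    centered = X - mean
    cov = centered.T @ (centered * w[:, None])   # weighted covariance
    return mean, cov
```

Both quantities are cheap to evaluate even for many particles and scale directly to higher-dimensional posteriors where histogram plots are no longer available.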

We also present the Laplace approximation [21] of the posterior distribution around the MAP estimate, depicted in Fig. 9a and c with green solid lines for the two cases without uncertainty. The Laplace approximation can be understood as a local quadratic approximation of the log-posterior density around the MAP, resulting in a Gaussian distribution. The covariance matrix of this local Gaussian distribution gives an idea of how fast the posterior changes in a certain direction, starting from the MAP estimate. Note that the posterior distribution is not known in closed form but only approximated by SMC particles with associated weights. We approximate the MAP estimate by the SMC particle that scored the highest posterior value in the last rejuvenation step of the last iteration. In these examples, the necessary gradients of the posterior distribution were approximated by finite differences w.r.t. the input variables \({\varvec{x}}\) on the GP regression model. The Laplace approximation was omitted for the case with uncertain inflow. Laplace approximations are used in approximative Bayesian inverse analysis methods [6], where the MAP is first found through optimization and the Laplace approximation is then computed for an estimate of the variance.

#### Influence of prior assumptions on the posterior distribution

To show the influence of quantifiable prior knowledge on the posterior, we discuss the same example with uniform prior and a combined beta-distribution and log-normal distribution prior on the model input \({\varvec{x}}\).

*Uniform prior* In this first demonstration we choose an uninformative uniform prior distribution for Young’s modulus \(p\left( E\right) =\mathcal {U}\left( E|100\,\textrm{Pa},800\,\textrm{Pa}\right) \) and for the Poisson ratio \(p\left( \nu \right) =\mathcal {U}\left( \nu |-0.8,0.5\right) \). Those are independent priors on the parameters and can be combined to \(p\left( E, \nu \right) = p\left( E\right) p\left( \nu \right) \).

Figure 9a displays the approximation of the resulting posterior in the form of a hexagonal histogram plot generated with the weighted particles from the SMC run. In Fig. 9b the resulting particle distribution of the SMC is shown along with the color-coded particle weights.

The posterior in Fig. 9a shows an almost linear band of high densities. This shows that, for the given parameter ranges, a good value for Young’s modulus \(E\) is more crucial than one for Poisson’s ratio \(\nu \) to obtain good similarity between model output and reference data. The posterior mean (PM) vector computes to \({\varvec{x}}_{\textrm{PM}}=[E\approx 431\,\textrm{Pa},\; \nu \approx -0.0071]^{\textsf{T}}\). In Fig. 9a the Gaussian approximation has the same orientation as the particle approximation to the posterior; still, it cannot represent the complexity of the posterior. The maximum a posteriori (MAP) estimate is approximated as \({\varvec{x}}_{\textrm{MAP}}= [E\approx 388\,\textrm{Pa},\; \nu \approx 0.266]^{\textsf{T}}\). This \({\varvec{x}}_{\textrm{MAP}}\) is close to the ground truth, considering the number of training points and the resulting resolution of training points in the input space.

Interestingly, the Laplace approximation, represented by its percentile contour lines in Fig. 9a in green, is oriented almost orthogonally to the actual posterior (solid black line). This means that the Laplace approximation gives a misleading local approximation in this specific case, which can occur if the posterior has a more complex curvature around the MAP. In our example, this might partially be induced by the GP approximation as well. The Laplace approximation cannot capture the complexity of the given posterior, and therefore simplified methods based on the Laplace approximation would work poorly in the presented examples without the analyst knowing. This is why we advocate the fully Bayesian treatment for this kind of problem.

As mentioned before, we also plot the marginal posterior distributions of \(p\left( E|Y_{\textrm{obs},C}\right) \) and \(p\left( \nu |Y_{\textrm{obs},C}\right) \), respectively. The marginals show that \(p\left( E|Y_{\textrm{obs},C}\right) \) has a single, stable, global optimum and \(p\left( \nu |Y_{\textrm{obs},C}\right) \) forms a plateau of high densities. Due to the strong coupling of \(E\) and \(\nu \) and the complex shape of the joint posterior \(p\left( E,\nu |Y_{\textrm{obs},C}\right) \), the marginal posterior distributions alone however are not informative for the coupling effects in the global posterior distribution.

*Informed prior* Physical insight can be incorporated into prior assumptions, can have a great influence on the posterior distribution, and should be integrated in the analysis. Besides the uninformative uniform prior used in the first example, we now also want to demonstrate the effect of an informed prior. For the Young’s modulus, a log-normal prior is assumed with a mode of \(E= 300 \,\textrm{Pa}\). The log-normal can be parameterized as \( \mathcal{L}\mathcal{N}\left( E|\mu _{{\mathcal {L}}{\mathcal {N}}}\approx 5.86,\sigma _{{\mathcal {L}}{\mathcal {N}}}= 0.4\right) \), where the parameters \(\mu _{{\mathcal {L}}{\mathcal {N}}}, \sigma _{{\mathcal {L}}{\mathcal {N}}}\) are not the mean and standard deviation of the distribution itself but of the underlying normal distribution, \(\log {(E)}\sim \mathcal {N}\left( \mu _{{\mathcal {L}}{\mathcal {N}}},{\sigma _{{\mathcal {L}}{\mathcal {N}}}^2}\right) \). This accounts for the fact that the Young’s modulus must be positive, that values close to zero are very unlikely, and that values higher than the mode are more probable than lower ones. For the Poisson’s ratio, a beta-distribution \(\mathcal {B}\left( \nu |a=43/13,b=22/13\right) \) between \(-0.8\) and 0.5 is used as prior. This distribution has its mode at \(\nu = 0.2\) and accounts for the belief that the Poisson’s ratio is more likely to be positive and is strongly bounded between \(-1.0\) and 0.5, with decreasing probability towards the boundaries of this interval.
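The stated value \(\mu_{{\mathcal {L}}{\mathcal {N}}}\approx 5.86\) follows directly from the mode of the log-normal distribution, \(\operatorname{mode}(E)=\exp (\mu_{{\mathcal {L}}{\mathcal {N}}}-\sigma_{{\mathcal {L}}{\mathcal {N}}}^2)\); a one-line check (the helper name is ours):

```python
import math

def lognormal_mu_from_mode(mode, sigma_ln):
    """For log(E) ~ N(mu, sigma^2) the mode of E is exp(mu - sigma^2),
    so a desired mode fixes mu = log(mode) + sigma^2."""
    return math.log(mode) + sigma_ln**2

# Prior on Young's modulus: mode 300 Pa with sigma_LN = 0.4 gives mu ~ 5.86
mu_ln = lognormal_mu_from_mode(300.0, 0.4)
```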

The modes of the presented priors are intentionally chosen to deviate from the ground truth \({\varvec{x}}_{\textrm{gt}}\) to show the effect of informative priors. The priors are plotted in Fig. 9c alongside the respective marginal posterior distributions. It can be easily observed that compared to the uniform priors used in Fig. 9a, the informed prior assumptions, i.e., distributions that weight specific areas in the input space higher than others, lead to a posterior distribution which is more pronounced around the ground truth \({\varvec{x}}_{\textrm{gt}}\) and has less probability mass in regions with low prior density. As the majority of the probability mass of the resulting posterior is now in a more compact area of the design space, also the marginal posterior distributions \(p\left( E|Y_{\textrm{obs},C}\right) \) and \(p\left( \nu |Y_{\textrm{obs},C}\right) \) show a more defined shape with one predominant mode.

Similar to Fig. 9a, the joint posterior \(p\left( E,\nu |Y_{\textrm{obs},C}\right) \) has more variance in the \(\nu \)-dimension, rendering \(\nu \) a *sloppy* parameter. This also becomes apparent as the marginal posterior distribution \(p\left( \nu |Y_{\textrm{obs},C}\right) \) almost coincides with the prior assumption \(p\left( \nu \right) \), indicating a very small influence of this parameter on the likelihood function. This also means that, for this type of measurement, the exact value of the model parameter \(\nu \) is of less importance for the agreement of the mechanical model and the observed experiment. In \(p\left( E|Y_{\textrm{obs},C}\right) \), on the other hand, the density does not have the same mode as the prior \(p\left( E\right) \). This underlines that the likelihood contains characteristic information about this parameter.

The MAP estimate has the values \({\varvec{x}}_{\textrm{MAP}}= [E\approx 361\,\textrm{Pa},\; \nu \approx 0.195]^{\textsf{T}}\) in this case. This is slightly lower in both parameter values than in the run with uniform prior. With the prior modes deviating from the ground truth, a deviation of \({\varvec{x}}_{\textrm{MAP}}\) from the ground truth towards lower values is expected. Furthermore, we see that the global Gaussian approximation of the posterior around \({\varvec{x}}_{\textrm{PM}}= [E\approx 377\,\textrm{Pa},\; \nu \approx 0.08]^{\textsf{T}}\) moves closer to the mode of the posterior distribution, as the low likelihood areas are further weighted with low prior values and therefore lose probability mass compared to the uniform priors. The Laplace approximation is still not a very good local approximation of the posterior.

#### Material calibration under uncertain boundary conditions

Now we also demonstrate the treatment of additional uncertain influences on the system, denoted as \(\theta \) above, using the example of an uncertain inflow volume rate in the flow cell experiment. As generally the biggest biofilm patches in the channel are analyzed, they take up a significant portion of the cross-section of the channel and force the flow to go around them. Further, only the middle of the channel can be scanned with high quality using OCT. Hence, the distribution of the volume flow rate between outlying parts of the cross-section and the analyzed patch is subject to uncertainties. Overall, these considerations are summarized in the assumption of an uncertain inflow rate distribution whose main mode lies significantly below the average inflow rate, with a broad spread around it and low density for higher flow rates. This is modeled by assuming \(p\left( \dot{V}_{\textrm{in}}\right) \) to be a beta-distribution \(\mathcal {B}\left( \dot{V}_{\textrm{in}}|a=2.6,b=1.4\right) \) between \(0 \,{\mathrm {mm^2}/\textrm{s}} \) and \( 110 \,{\mathrm {mm^2}/\textrm{s}} \) with mode at \(88 \,{\mathrm {mm^2}/\textrm{s}}\), which is plotted in Fig. 10. The value of the volume inflow rate used for data generation is kept the same as in all other examples, \(\dot{V}_{\textrm{in}}= 100\,{\mathrm {mm^2}/\textrm{s}} \). The distribution accounts for the belief that less than the average fluid volume rate flows over the biofilm compared to the rest of the channel. For the two material parameters, uniform priors \(p\left( E\right) =\mathcal {U}\left( E|100\,\textrm{Pa},800\,\textrm{Pa}\right) \) and \(p\left( \nu \right) =\mathcal {U}\left( \nu |-0.8,0.5\right) \) were used.

The resulting joint posterior of the two analyzed parameters under the influence of the uncertain inflow is plotted in Fig. 9d. The posterior under uncertainty, denoted by \(q(E,\nu |Y_{\textrm{obs},C})={\mathbb {E}}_{\dot{V}_{\textrm{in}}}\left[ p\left( E,\nu |\dot{V}_{\textrm{in}},Y_{\textrm{obs},C}\right) \right] \), was calculated according to (3b) by incorporating the average effect of the uncertainty of the inflow \(\dot{V}_{\textrm{in}}\sim p\left( \dot{V}_{\textrm{in}}\right) \). In Fig. 9d it can be seen that, as expected, the additional consideration of the inflow uncertainty made the posterior less expressive; an increase in the variance can be found especially along the dimension of the Young’s modulus.
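The averaging over the uncertain condition can be illustrated by a Monte Carlo marginalization of the likelihood over samples \(\dot{V}_{\textrm{in},i}\sim p(\dot{V}_{\textrm{in}})\), evaluated in a numerically stable log-sum-exp form. This is only a sketch of the idea behind (3b); the paper evaluates the expectation on the SMC particle representation instead, and the function name is ours.

```python
import numpy as np

def marginalized_log_likelihood(log_lik, x, theta_samples):
    """Monte Carlo marginalization over the uncertain condition theta
    (here the inflow rate): log of the prior-weighted average likelihood
    E_theta[ L(Y | x, theta) ], with theta_i drawn from p(theta),
    computed via a log-sum-exp for numerical stability."""
    ll = np.array([log_lik(x, th) for th in theta_samples])
    m = ll.max()
    return m + np.log(np.mean(np.exp(ll - m)))
```

If the likelihood is insensitive to \(\theta\), the marginalized value reduces to the original log-likelihood; otherwise the averaging flattens the likelihood, which is the loss of expressiveness visible in Fig. 9d.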

The consideration of an uncertain boundary condition has also shifted the point estimates. The MAP estimate for both inputs \({\varvec{x}}\) moves to lower values, \({\varvec{x}}_{\textrm{MAP}}=[E\approx 322\,\textrm{Pa},\; \nu \approx 0.09]^{\textsf{T}} \), than in the uniform prior example without uncertainties. The posterior mean is \({\varvec{x}}_{\textrm{PM}}= [E\approx 416\,\textrm{Pa},\; \nu \approx -0.09]^{\textsf{T}}\). This is expected, as the mean of the assumed inflow distribution is lower than the value used for data generation, so a lower stiffness leads to a better-fitting deformation.

### Remark 8

(MAP estimation after marginalization) With the uncertain parameter \(\varvec{\theta }(=\dot{V}_{\textrm{in}}) \) taken into account, the MAP estimate is more difficult to obtain than in the case without uncertainties, because the extended posterior (3b) must first be integrated to obtain the posterior under uncertainty. Although this integration can easily be evaluated using the particle representation (13), the maximum of the posterior under uncertainty cannot be found without a further assumption. We chose a histogram approach, collecting the SMC particles in square bins with 30 intervals per input variable \({\varvec{x}}\), according to the illustration in Fig. 9d. The trick of picking the particle that scored the highest posterior value in the last SMC iteration does not work for the posterior under uncertainty or for marginal distributions, as their determination first requires an integration step.
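The binning step can be sketched as follows; the particle cloud below is synthetic (centered near the reported MAP), and the helper name `histogram_map` is hypothetical:

```python
# Sketch of the histogram-based MAP determination: weighted SMC particles are
# collected in square bins (30 intervals per input dimension) and the center
# of the highest-scoring bin is reported.
import numpy as np

def histogram_map(particles, weights, bins=30):
    """Approximate MAP of a 2D particle cloud via a weighted 2D histogram."""
    H, xe, ye = np.histogram2d(particles[:, 0], particles[:, 1],
                               bins=bins, weights=weights)
    i, j = np.unravel_index(np.argmax(H), H.shape)
    return np.array([0.5 * (xe[i] + xe[i + 1]), 0.5 * (ye[j] + ye[j + 1])])

# synthetic particle cloud concentrated around (E, nu) = (322 Pa, 0.09)
rng = np.random.default_rng(0)
particles = rng.normal([322.0, 0.09], [30.0, 0.05], size=(10_000, 2))
weights = np.full(10_000, 1.0 / 10_000)
x_map = histogram_map(particles, weights)
```

The resolution of this estimate is limited by the bin width, which is why it remains a rough approximation of the true maximum.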

The global Gaussian approximation becomes more isotropic when the uncertainty is considered, which manifests as more circular percentile lines. In terms of the posterior covariance, this means that there is no prevalent direction in the posterior. Compared to the posterior for the fixed boundary condition, the posterior including the uncertainty \(q(E,\nu |Y_{\textrm{obs},C})={\mathbb {E}}_{\dot{V}_{\textrm{in}}}\left[ p\left( E,\nu |\dot{V}_{\textrm{in}},Y_{\textrm{obs},C}\right) \right] \) is less restrictive in terms of the necessary assumptions and therefore also less strict in its results: a broader range of parameters can explain the observed experimental data. Nevertheless, better knowledge about the uncertainties \(p\left( \dot{V}_{\textrm{in}}\right) \) can greatly improve the calibration result.
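A global Gaussian approximation of a particle posterior is simply the weighted mean and covariance of the particles; the eigenvalue ratio of the covariance then quantifies how isotropic the approximation is (a ratio of 1 means perfectly circular percentile lines). The particle data below is a synthetic stand-in:

```python
# Weighted mean/covariance of SMC particles (global Gaussian approximation)
# plus an anisotropy measure: the ratio of the covariance eigenvalues.
import numpy as np

def gaussian_approximation(particles, weights):
    w = weights / weights.sum()
    mean = w @ particles
    centered = particles - mean
    cov = (w[:, None] * centered).T @ centered
    return mean, cov

def anisotropy(cov):
    eigvals = np.linalg.eigvalsh(cov)   # ascending order
    return eigvals[-1] / eigvals[0]     # 1.0 == isotropic

# synthetic, nearly isotropic particle cloud as a stand-in for the SMC result
rng = np.random.default_rng(1)
particles = rng.standard_normal((5000, 2))
weights = np.full(5000, 1.0 / 5000)
mean, cov = gaussian_approximation(particles, weights)
```
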

In case non-controllable *aleatory* uncertainty is present, neglecting it will lead to an overconfident, wrong posterior, as the analyst introduces a modeling error by omitting these effects. Incorporating aleatory uncertainty in the probabilistic model will generally introduce more uncertainty into the posterior (the distribution widens) and might potentially even completely change the characteristics of the posterior distribution.

### Calibration of heterogeneous biofilm model under uncertain inflow boundary condition

In our last example we want to calibrate material properties of a heterogeneous biofilm FSI model as in [12], but here under an uncertain inflow rate boundary condition. Heterogeneity comes into play in such problems due to different age and/or different nutrient availability of different parts of the domain. Hence, it was a natural choice to also shed some light on a more demanding case involving more parameters. The three subdomains are depicted in Fig. 11 and lead to six input parameters, consisting of the Young’s modulus and Poisson’s ratio of each subdomain.

The parameters \({\varvec{x}}_{\textrm{gt}}=[E_1 = 500 \,\textrm{Pa},\; \nu _1 = 0.2,\;E_2 = 200 \,\textrm{Pa},\; \nu _2 = 0.1,\; E_3 = 1000 \,\textrm{Pa},\; \nu _3 = 0.3]^{\textsf{T}}\) for a hyperelastic Saint-Venant-Kirchhoff material model are used as the ground truth inputs, along with the inflow rate \(\dot{V}_{\textrm{in}}=100\,{\mathrm {mm^2}/\textrm{s}}\). All other model details remain the same as in the previous examples and are used to generate the synthetic experimental data \(Y_{\textrm{obs},C}\) for this heterogeneous case. Note that in this example we additionally assume noise-polluted measurement data according to

$$\begin{aligned} \begin{aligned} Y_{\textrm{obs},C}&= \mathfrak {M}\left( \varvec{x}_{\text {gt}},\varvec{\theta }_{\text {gt}},C\right) + \sigma _{\text {n,obs}}\cdot \varvec{\epsilon }\\ \text {with } \varvec{\epsilon }&\sim \mathcal {N}\left( \varvec{0},{I}\right) \end{aligned} \end{aligned}$$

(16)

For the following example we choose noise with a standard deviation of \(\sigma _{\text {n,obs}}=1\,\mathrm {\mu m}\). In Remark 9 we comment on the determination of the GP surrogate for this example.
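The data generation of (16) amounts to adding i.i.d. Gaussian noise to the forward model output. A minimal sketch, with a hypothetical array standing in for the interface coordinates from the FSI model (units in micrometers, so \(\sigma _{\text {n,obs}}=1\)):

```python
# Sketch of the noisy data generation according to (16): Y = M(x, theta, C)
# + sigma * eps with eps ~ N(0, I). The model output is a hypothetical
# stand-in for the interface coordinates of the FSI forward model.
import numpy as np

rng = np.random.default_rng(0)
sigma_n_obs = 1.0                                # noise level: 1 micrometer

model_output = np.linspace(0.0, 50.0, 200)       # hypothetical interface data
eps = rng.standard_normal(model_output.shape)    # eps ~ N(0, I)
y_obs = model_output + sigma_n_obs * eps         # noisy synthetic observation
```
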

### Remark 9

(Convergence of the Gaussian process surrogate) As the GP surrogate model for the likelihood now depends on six input variables plus one uncertain variable, more training data is necessary to reach acceptable accuracy of the posterior mean function of the GP. A small convergence study was performed to find an appropriate training size. To this end, we successively increased \(n_\textrm{train}\) along the Sobol sequence and calculated the \(\text {L}_2\)-error norm between the posterior mean of the GP and the likelihood evaluated with the forward model at \(n_{\textrm{test}}=100\) test points, which are fixed consecutive samples from later in the Sobol sequence, unused in the training data. This is a representative choice, as the good space-filling property leads to well-distributed test points. The result of the convergence study is plotted in Fig. 12. The tested scenario is a kernel with multiple length scales and multiplicative coupling [39, 53], as used in the following example. The parameters are expected, and also observed, to have a different influence on the likelihood. Therefore, a multiplicative coupling of Matérn 3/2 kernels with individual length scales and variances for each parameter dimension is used. As opposed to the L-BFGS-B optimizer used in the previous examples, a scaled conjugate gradient optimizer is used for training, for stability reasons.

Figure 12 shows that between \(n_\textrm{train}=1000\) and \(n_\textrm{train}=5000\) no significant improvement in the given error measure is achieved. Therefore, \(n_\textrm{train}=2000\) training points are chosen for the following example as a compromise between efficiency and accuracy. In general, asymptotic convergence can be expected, meaning that the error measure tends to zero for an infinite number of training points. For \(n_\textrm{train}=500\) an increase in the error can be detected: a new characteristic of the likelihood function was introduced by the training points added between \(n_\textrm{train}=200\) and \(n_\textrm{train}=500\), which led to the increased deviation at the test points.
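The structure of such a convergence study can be sketched as below. A smooth toy function replaces the expensive likelihood, and, for brevity, a single anisotropic Matérn kernel from scikit-learn approximates the paper's multiplicative coupling of per-dimension kernels; the dimensions and training sizes are reduced for illustration:

```python
# Sketch of the GP surrogate convergence study: Sobol training points of
# increasing size, fixed held-out Sobol test points, L2 error of the GP mean.
# The target f is a toy stand-in for the likelihood.
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

dim = 3                                              # reduced from 6 + 1

def f(X):                                            # smooth toy "likelihood"
    return np.exp(-np.sum((X - 0.5) ** 2, axis=1) / 0.1)

sobol = qmc.Sobol(d=dim, scramble=False)
X_all = sobol.random(2 ** 9)                         # 512 Sobol points
X_test, y_test = X_all[-100:], f(X_all[-100:])       # held-out test points

errors = []
for n_train in (2 ** 5, 2 ** 8):                     # growing training sets
    X_tr, y_tr = X_all[:n_train], f(X_all[:n_train])
    kernel = Matern(length_scale=[0.3] * dim, nu=1.5)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    gp.fit(X_tr, y_tr)
    errors.append(np.linalg.norm(gp.predict(X_test) - y_test))
```

With a smooth target, the test error is expected to drop as the training set grows, analogous to the trend in Fig. 12.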

For the SMC approach, a set of 10,000 particles with 30 rejuvenation steps was used, with a step size control parameter of \(\zeta = 0.998\). For comparison with the \(n_\textrm{train}=2000\) training points used, a full evaluation in every rejuvenation step of the SMC algorithm would require three million (\(10{,}000\cdot 30\cdot 10= 3{,}000{,}000\)) forward model evaluations for exemplary 10 SMC steps. This is neither desirable nor feasible if applied directly to the expensive forward model. After the proof of concept in the previous examples with two input parameters and potentially one uncertain parameter, it must be emphasized that volume integrals for the respective marginals, and especially the normalization of the posterior (1), scale even worse in six plus one dimensions. Even a moderately sized, grid-based discretization with 100 sample points in every dimension would require \(100^7\) evaluations of the GP mean. We therefore make use of the good scalability of the SMC algorithm in higher dimensions in this example.

As already indicated above, in this last example we want to show the full capability of the approach, and therefore we use an example with uncertain inflow rate \(\dot{V}_{\textrm{in}}\) and noise added to the data generated from the ground truth values. The assumed distribution on \(\dot{V}_{\textrm{in}}\) is the same as in Fig. 10 above, a beta-distribution \(\mathcal {B}\left( \dot{V}_{\textrm{in}}|a=2.6,b=1.4\right) \) between \(0 \,{\mathrm {mm^2}/\textrm{s}} \) and \( 110 \,{\mathrm {mm^2}/\textrm{s}} \) with its mode at \(88 \,{\mathrm {mm^2}/\textrm{s}}\). The MAP approximation is found at \({\varvec{x}}_{\textrm{MAP}}=[E_1 \approx 345 \,\textrm{Pa},\; \nu _1 \approx 0.43,\;E_2 \approx 135 \,\textrm{Pa},\; \nu _2 \approx 0.30,\; E_3 \approx 1034 \,\textrm{Pa},\; \nu _3 \approx -0.47]^{\textsf{T}}\) and the global posterior mean computes to \({\varvec{x}}_{\textrm{PM}}= [E_1 \approx 443 \,\textrm{Pa},\; \nu _1 \approx -0.15,\;E_2 \approx 431 \,\textrm{Pa},\; \nu _2 \approx -0.13,\; E_3 \approx 637 \,\textrm{Pa},\; \nu _3 \approx -0.14]^{\textsf{T}}\). Analogously to the lower-dimensional example, the MAP estimate is obtained with a binning approach, here with 10 intervals in each input dimension. It is therefore a rough estimate, but the curse of dimensionality prohibits an excessively fine grid. The maximum of the posterior under uncertainty qualitatively reproduces the ground truth with \(E_3>E_1>E_2 \) and \(\nu _1 > \nu _2\); only \(\nu _3\) does not correspond to the ground truth. This can be a consequence of the low relevance of this parameter to the overall posterior, as the lateral contraction in the stiffest subdomain, which is attached to the ground, has little effect on the interface deformation. In challenging examples it can be a good idea to first perform a global sensitivity analysis [54, 55] and then focus the inverse analysis on the most sensitive parameters only.
This also shows again that the problem setup is challenging, as the volume inflow rate is treated as an uncertain parameter, which is often the case in real-world applications. As a consequence, the MAP parameters correspond to a less stiff biofilm material than the one used for the ground truth simulation.

The resulting one-dimensional marginals over the parameters, for which uniform priors were assumed, are plotted in Fig. 13 as histograms with KDE approximations in black. Here the symbol \(q({\varvec{x}}|Y_{\textrm{obs},C})\) is used because the distributions represent marginals of the posterior under uncertainty (3b).

It can be observed that most of these marginals show a rather uniform distribution. In this example, this happens because each marginal represents the integration of the posterior over all other parameters, so the probability mass is accumulated therein. Only for subdomain \(\Omega _2\) (Fig. 13c and d) do the posterior marginals accumulate density around the ground truth, \(E_2 \approx 200\,\textrm{Pa}\) and \(\nu _2 > 0\). As these one-dimensional marginals do not reveal the full complexity of the posterior, two-dimensional marginals for selected combinations of parameters are additionally shown in Fig. 14 as hexagonal histograms with the percentile lines of the KDE approximations. There, the interaction effects between the respective parameter combinations can be seen; the most probable parameter combinations for the respective marginals lie within the regions enclosed by the \(10\%\) percentile lines.
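A one-dimensional marginal of this kind can be visualized directly from the particle samples via a histogram with a KDE on top; the \(E_2\) particle values below are synthetic, centered near the ground truth of \(200\,\textrm{Pa}\):

```python
# Sketch of the 1D marginal visualization: particle samples summarized by a
# Gaussian KDE, as in Fig. 13. The E_2 samples are a synthetic stand-in.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
E2_particles = rng.normal(200.0, 30.0, size=10_000)   # hypothetical marginal

kde = gaussian_kde(E2_particles)                      # KDE over the particles
grid = np.linspace(100.0, 300.0, 201)
density = kde(grid)
mode_estimate = grid[np.argmax(density)]              # should lie near 200 Pa
```
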

The marginal posterior distributions in Fig. 14 show complex densities. Compared to a deterministic calibration approach for the same example [12], where even the simpler case with a fixed volume inflow rate \(\dot{V}_{\textrm{in}}\) could not be solved, this complex character is well represented in the solution obtained here. Furthermore, Fig. 14a and c show that \(E_2 \) dominates the marginal posteriors, which are high around \(E_2 \approx 200 \,\textrm{Pa}\) for a plausible range of the other Young's moduli \(E_1,E_3\). The shape of the distribution for \(\Omega _2 \) in Fig. 14e also resembles the posterior shape in Fig. 9d, i.e., that of the homogeneous example with uniform prior and uncertain inflow. This means that the parameters of \(\Omega _2\) have a qualitatively similar influence on this marginal as the two parameters have on the posterior in the homogeneous example. This is expected, as \(\Omega _2 \) takes up the largest portion of the biofilm domain (see Fig. 11) and therefore dominates the deformation.

As six parameters are involved, a full plot of the posterior is impossible. Instead, a parallel axis plot as in Fig. 15 can be used to get an idea of the underlying posterior shape.

Over the first six axes we see the input parameters in the range of interest. The seventh axis shows the extended posterior values of the 10,000 particles from the SMC particle approximation, and the last axis represents the uncertain inflow boundary condition \(\dot{V}_{\textrm{in}}\). Every line connecting the axes represents one particle of the SMC solution, and all lines are colored by their absolute extended posterior density value. The lines are drawn on top of each other, starting with low posterior density values in blue and ending with the highest in red. Most parameter combinations with high posterior density accumulate around \(E_2 \approx 200\,\textrm{Pa}\) and \(\nu _1, \nu _2 > 0\). The other three parameters \(E_1, E_3, \nu _3\) score relatively high posterior values over the whole range of interest, meaning they are individually less decisive for a specific value of the posterior density. Given the subdomain topology shown in Fig. 11, these posterior results are to be expected, as \(\Omega _2 \) takes up the largest portion of the biofilm and can therefore be expected to have the highest influence on the deformation in this case.
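The drawing scheme described above (one polyline per particle, low-density lines first, blue-to-red coloring) can be sketched with matplotlib as follows; all particle data is synthetic and the axis layout is only illustrative:

```python
# Minimal sketch of a parallel-axis plot as in Fig. 15: one polyline per SMC
# particle across all axes, colored by the (normalized) extended posterior
# density value, with low-density lines drawn first.
import matplotlib
matplotlib.use("Agg")                 # headless backend for scripted use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
n_particles, n_axes = 200, 8          # 6 inputs + posterior value + inflow
data = rng.random((n_particles, n_axes))
post_vals = data[:, 6]                # axis 7 holds the posterior value

# normalize each axis to [0, 1] so all quantities share one vertical scale
scaled = (data - data.min(axis=0)) / (data.max(axis=0) - data.min(axis=0))

order = np.argsort(post_vals)         # draw low-density lines first
fig, ax = plt.subplots()
cmap = plt.cm.coolwarm                # blue (low) to red (high)
for i in order:
    ax.plot(range(n_axes), scaled[i], color=cmap(post_vals[i]), linewidth=0.5)
ax.set_xticks(range(n_axes))
```
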

The possibility of such a phenomenological interpretation of the posterior is a very attractive feature of tackling the inverse problem via the proposed approach: not only can good point estimates be obtained, but, more importantly, the influence of the individual parameters on the posterior can be observed and interpreted. The information revealed in such a probabilistic analysis, as displayed for example in Figs. 14 and 15, gives a lot of insight into the problem at hand. Among other things, it shows how useful the given experimental information is, or whether other measurements are needed for the identification of the relevant parameters. With the rich information from such an analysis, an estimation of the plausibility and stability of the point estimates can also be made. As an example, a deterministic or trial-and-error approach might end up identifying negative Poisson's ratios for biofilms (easily understandable when looking at Figs. 9 or 15) and hence classifying biofilms as auxetic materials, as has been done in the past. The probabilistic analysis would immediately reveal the lack of validity of such a conclusion.