 Research article
 Open access
Accelerated construction of projection-based reduced-order models via incremental approaches
Advanced Modeling and Simulation in Engineering Sciences volume 11, Article number: 8 (2024)
Abstract
We present an accelerated greedy strategy for the training of projection-based reduced-order models for parametric steady and unsteady partial differential equations. Our approach exploits the hierarchical approximate proper orthogonal decomposition to speed up the construction of the empirical test space for least-square Petrov–Galerkin formulations, a progressive construction of the empirical quadrature rule based on a warm start of the nonnegative least-squares algorithm, and a two-fidelity sampling strategy to reduce the number of expensive greedy iterations. We illustrate the performance of our method for two test cases: a two-dimensional compressible inviscid flow past an LS89 blade at moderate Mach number, and a three-dimensional nonlinear mechanics problem to predict the long-time structural response of the standard section of a nuclear containment building under external loading.
Introduction
Projection-based model reduction of parametric systems
In the past few decades, several studies have shown the potential of model order reduction (MOR) techniques to speed up the solution to many-query and real-time problems, and ultimately enable the use of physics-based three-dimensional models for design and optimization, uncertainty quantification, and real-time control and monitoring tasks. The distinctive feature of MOR methods is the offline/online computational decomposition: during the offline stage, high-fidelity (HF) simulations are employed to generate an empirical reduced-order approximation of the solution field and a parametric reduced-order model (ROM); during the online stage, the ROM is solved to estimate the solution field and relevant quantities of interest for several parameter values. Projection-based ROMs (PROMs) rely on the projection of the equations onto a suitable low-dimensional test space. Successful MOR techniques should hence achieve significant online speedups at acceptable offline training costs. This work addresses the reduction of the offline training costs of PROMs for parametric steady and unsteady partial differential equations (PDEs).
We are interested in the solution to steady parametric conservation laws. We denote by \(\mu \) the vector of p model parameters in the compact parameter region \(\mathcal {P}\subset {\mathbb {R}}^p\); given the domain \(\Omega \subset {\mathbb {R}}^d\) (\(d=2\) or \(d=3\)), we introduce the Hilbert spaces \((\mathcal {X}, \Vert \cdot \Vert )\) and \((\mathcal {Y}, {\left| \! \left| \! \left| {\cdot } \right| \! \right| \! \right|})\) defined over \(\Omega \). Then, we consider problems of the form: given \(\mu \in \mathcal {P}\), find \(u_{\mu } \in \mathcal {X}\) such that
$$\begin{aligned} \mathfrak {R}\left( u_{\mu }, v ; \, \mu \right) \, = \, 0 \quad \forall \, v\in \mathcal {Y}, \end{aligned}$$(1)
where \(\mathfrak {R}:\mathcal {X} \times \mathcal {Y} \times \mathcal {P} \rightarrow {\mathbb {R}}\) is the parametric residual associated with the PDE of interest. We here focus on linear approximations, that is, we consider reduced-order approximations of the form
$$\begin{aligned} \widehat{u}_{\mu } \, = \, Z \widehat{\alpha }_{\mu }, \end{aligned}$$(2)
where \(Z:{\mathbb {R}}^n \rightarrow \mathcal {X}\) is a linear operator, and n is much smaller than the size of the HF model; \(\widehat{\alpha }: \mathcal {P} \rightarrow {\mathbb {R}}^n\) is the vector of generalized coordinates. We here exploit the least-square Petrov–Galerkin (LSPG, [1]) ROM formulation proposed in [2], which is well-adapted to approximating advection-dominated problems: the approach relies on the definition of a low-dimensional empirical test space \(\widehat{\mathcal {Y}} \subset \mathcal {Y}\); furthermore, it relies on the approximation of the HF residual \(\mathfrak {R}\) through hyperreduction [3] to enable fast online calculations of the solution to the ROM. In more detail, we consider an empirical quadrature (EQ) procedure [4, 5] for hyperreduction: EQ methods recast the problem of finding a sparse quadrature rule to approximate \(\mathfrak {R}\) as a sparse representation problem and then resort to optimization algorithms to find an approximate solution. Following [4], we resort to the nonnegative least-squares (NNLS) method to find the quadrature rule.
We further consider the application to unsteady problems: to ease the presentation, we consider one-step time discretizations based on the time grid \(\{ t^{(k)} \}_{k=0}^K\). Given \(\mu \in \mathcal {P}\), we seek the sequence \(\{ u_{\mu }^{(k)} \}_{k=1}^{K} \subset \mathcal {X}\) such that
$$\begin{aligned} \mathfrak {R}^{(k)}\left( u_{\mu }^{(k)}, u_{\mu }^{(k-1)}, v ; \, \mu \right) \, = \, 0 \quad \forall \, v\in \mathcal {Y}, \quad k=1,\ldots ,K, \end{aligned}$$(3)
for all \(\mu \in \mathcal {P}\). As for the steady-state case, we consider linear ansätze of the form \(\widehat{u}_{\mu }^{(k)} = Z \widehat{\alpha }_{\mu }^{(k)}\), where \(Z:{\mathbb {R}}^n \rightarrow \mathcal {X}\) is a linear time- and parameter-independent operator, and \(\widehat{\alpha }^{(0)}, \ldots , \widehat{\alpha }^{(K)} : \mathcal {P} \rightarrow {\mathbb {R}}^n\) are obtained by projecting Eq. (3) onto a low-dimensional test space. To speed up online costs, we also replace the HF residual in (3) with a rapidly-computable surrogate through the same hyperreduction technique considered for steady-state problems.
Following the seminal work by Veroy [6], numerous authors have resorted to greedy methods to adaptively sample the parameter space, with the ultimate aim of reducing the number of HF simulations performed during the training phase. Algorithm 1 summarizes the general methodology. First, we initialize the reduced-order basis (ROB) Z and the ROM based on a priori sampling of the parameter space; second, we repeatedly solve the ROM and we estimate the error over a range of parameters \(\mathcal {P}_{\textrm{train}}\subset \mathcal {P}\); third, we compute the HF solution for the parameter that maximizes the error indicator; fourth, if the error is above a certain threshold, we update the ROB Z and the ROM, and we iterate; otherwise, we terminate. Note that the algorithm depends on an a posteriori error indicator \(\Delta \): if \(\Delta \) is a rigorous a posteriori error estimator, we might apply the termination criterion directly to the error indicator (and hence save one HF solve). The methodology has been extended to unsteady problems in [7]: the method in [7] combines a greedy search driven by an a posteriori error indicator with proper orthogonal decomposition (POD, [8, 9]) to compress the temporal trajectory.
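To fix ideas, the greedy loop just described can be sketched in a few lines of Python; `solve_hf`, `fit_rom`, and `error_indicator` are hypothetical placeholders for the problem-specific HF solver, the ROM construction, and the indicator \(\Delta \):

```python
import numpy as np

def weak_greedy(solve_hf, fit_rom, error_indicator, P_train, mu0, tol, max_it=20):
    """Minimal sketch of the weak-greedy loop (Algorithm 1 in the text).

    solve_hf(mu)             -> HF snapshot for parameter mu
    fit_rom(snapshots)       -> ROM built from the current snapshot list
    error_indicator(rom, mu) -> a posteriori error estimate Delta(mu)
    All three callables are placeholders for the problem-specific pieces."""
    snapshots = [solve_hf(mu0)]              # a priori initialization (Line 1)
    sampled = [mu0]
    for _ in range(max_it):
        rom = fit_rom(snapshots)             # update ROB and ROM
        errs = [error_indicator(rom, mu) for mu in P_train]
        j = int(np.argmax(errs))             # parameter maximizing the indicator
        if errs[j] < tol:                    # termination criterion
            break
        snapshots.append(solve_hf(P_train[j]))   # one HF solve per iteration
        sampled.append(P_train[j])
    return rom, sampled
```

The loop is inherently sequential: each iteration needs the ROM fitted to all previously sampled snapshots before the next parameter can be chosen.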
As observed by several authors, greedy methods enable effective sampling of the parameter space [10]; however, they suffer from several limitations that might ultimately limit their effectiveness compared to standard a priori sampling. First, Algorithm 1 is inherently sequential and cannot hence benefit from parallel architectures. Second, Algorithm 1 requires the solution to the ROM and the evaluation of the error indicator for several parameter values at each iteration of the offline stage; similarly, it requires the update of the ROM—i.e., the trial and test bases, the reduced quadrature, and possibly the data structures employed to evaluate the error indicator. These observations motivate the development of more effective training strategies for MOR.
Contributions and relation to previous work
We propose an acceleration strategy for Algorithm 1 based on three ingredients: (i) a hierarchical construction of the empirical test space for LSPG ROMs based on the hierarchical approximate POD (HAPOD, [11]); (ii) a progressive construction of the empirical quadrature rule based on a warm start of the NNLS algorithm; (iii) a two-fidelity sampling strategy that is based on the application of the strong-greedy algorithm (see, e.g., [12, section 7.3]) to a dataset of coarse simulations. Regarding (iii), sampling based on coarse simulations is employed to initialize Algorithm 1 (cf. Line 1) and ultimately reduce the number of expensive greedy iterations. We illustrate the performance of our method for two test cases: a two-dimensional compressible inviscid flow past an LS89 blade at moderate Mach number, and a three-dimensional nonlinear mechanics problem to predict the long-time structural response of the standard section of a nuclear containment building (NCB) under external loading.
Our method shares several features with previously developed techniques. Incremental POD techniques have been extensively applied to avoid the storage of the full snapshot set for unsteady simulations (see, e.g., [11] and the references therein): here, we adapt the incremental approach described in [11] to the construction of the test space for LSPG formulations. Chapman [13] extensively discussed the parallelization of the NNLS algorithm for MOR applications: we envision that our method can easily be combined with the method in [13] to further reduce the training costs. Several authors have also devised strategies to speed up the greedy search through the vehicle of a surrogate error model [14, 15]; our multifidelity strategy extends the work by Barral [16] to unsteady problems and to a more challenging—from the perspective of the HF solver—compressible flow test. We remark that our approach is similar in scope to the work by Benaceur [17], which devised a progressive empirical interpolation method (EIM, [18]) for hyperreduction of nonlinear problems. Finally, we observe that multifidelity techniques have been extensively considered for non-intrusive MOR (see, e.g., [19] and the references therein).
The paper is organized as follows. In “Projection-based model reduction of parametric systems”, we review relevant elements of the construction of MOR techniques based on LSPG projection for steady conservation laws; we further address the construction of Galerkin ROMs for problems of the form (3). Then, in “Accelerated construction of PROMs”, we present the accelerated strategy for both classes of ROMs. “Numerical results” illustrates the performance of the method for two model problems. “Conclusions” draws some conclusions and discusses future developments.
Projection-based model reduction of parametric systems
“Preliminary definitions and tools” summarizes notation that is employed throughout the paper and reviews three algorithms—POD, strong-greedy, and the active set method for NNLS problems—that are used afterwards. “Least-square Petrov–Galerkin formulation of steady problems” reviews the LSPG formulation employed in the present work for steady conservation laws and illustrates the strategies employed for the definition of the quadrature rule and the empirical test space. Finally, “Vanilla POD-greedy for Galerkin time-marching ROMs” addresses model reduction of unsteady systems of the form (3).
Preliminary definitions and tools
We denote by \(\mathcal {T}_{\textrm{hf}} = \left( \{ {x}_j^\textrm{hf}\}_{j=1}^{N_{\textrm{nd}}}, \texttt {T} \right) \) the HF mesh of the domain \(\Omega \) with nodes \(\{ {x}_j^{\textrm{hf}}\}_j\) and connectivity matrix \(\texttt {T}\); we introduce the elements \(\{ \texttt {D}_k \}_{k=1}^{N_{\textrm{e}}}\) and the facets \(\{ \texttt {F}_j \}_{j=1}^{N_{\textrm{f}}}\) of the mesh and we define the open set \(\widetilde{\texttt {F}}_j\) in \(\Omega \) as the union of the elements of the mesh that share the facet \(\texttt {F}_j\), with \(j=1,\ldots ,N_{\textrm{f}}\). We further denote by \(\mathcal {X}_{\textrm{hf}}\) the HF space associated with the mesh \(\mathcal {T}_{\textrm{hf}}\) and we define \(N_{\textrm{hf}}:=\textrm{dim} ( \mathcal {X}_{\textrm{hf}} )\).
Exploiting the previous definitions, we can introduce the HF residual:
for all \(w,v\in \mathcal {X}_{\textrm{hf}}\). To shorten notation, in the remainder, we do not explicitly include the restriction operators in the residual. The global residual can be viewed as the sum of local elementwise residuals and local facetwise residuals, which can be evaluated at a cost that is independent of the total number of elements and facets, and is based on local information. The nonlinear infinitedimensional statement (1) translates into the highdimensional problem: find \(u_{\mu }^\textrm{hf}\in \mathcal {X}_{\textrm{hf}} \) such that
Since (5) is nonlinear, its solution requires solving a nonlinear system of \(N_{\textrm{hf}}\) equations in \(N_{\textrm{hf}}\) unknowns. Towards this end, we here resort to the pseudo-transient continuation (PTC) strategy proposed in [20].
We use the method of snapshots (cf. [8]) to compute POD eigenvalues and eigenvectors. Given the snapshot set \(\{ u^k \}_{k=1}^{n_{\textrm{train}}} \subset \mathcal {X}_{\textrm{hf}}\) and the inner product \((\cdot , \cdot )_{\textrm{pod}}\), we define the Gramian matrix \(\textbf{C} \in {\mathbb {R}}^{n_{\textrm{train}} \times n_{\textrm{train}}}\), \(\textbf{C}_{k,k'}= (u^k, u^{k'})_{\textrm{pod}}\), and we define the POD eigenpairs \(\{(\lambda _i, \zeta _i) \}_{i=1}^{n_{\textrm{train}}}\) as
with \(\lambda _1 \ge \lambda _2 \ge \ldots \ge \lambda _{n_{\textrm{train}}} \ge 0\). In our implementation, we orthonormalize the modes, that is, \(( \zeta _n, \zeta _n)_{\textrm{pod}}= 1\) for \(n=1,\ldots ,n_{\textrm{train}}\). In the remainder we use notation
to refer to the application of POD to the snapshot set \(\{ u^k \}_{k=1}^{n_{\textrm{train}}}\). The number of modes n can be chosen adaptively by ensuring that the retained energy content is above a certain threshold (see, e.g., [12, Eq. (6.12)]).
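As an illustration, a minimal method-of-snapshots implementation might read as follows; the energy-based selection of n is one common criterion, and the inner product is passed as a callable (both are sketches, not the authors' exact implementation):

```python
import numpy as np

def pod(snapshots, inner, tol=1e-4):
    """Method-of-snapshots POD (sketch). `snapshots` is a list of arrays and
    `inner(u, v)` the chosen inner product; the dimension n is selected so
    that the retained energy exceeds 1 - tol^2 (one common criterion)."""
    ns = len(snapshots)
    C = np.array([[inner(u, v) for v in snapshots] for u in snapshots])  # Gramian
    lam, V = np.linalg.eigh(C)                 # eigenvalues in ascending order
    lam, V = lam[::-1], V[:, ::-1]             # sort descending
    energy = np.cumsum(lam) / np.sum(lam)
    n = int(np.searchsorted(energy, 1.0 - tol**2) + 1)
    modes = []
    for i in range(n):
        # linear combination of snapshots; normalization by sqrt(lambda_i)
        # makes the modes orthonormal with respect to `inner`
        zeta = sum(V[k, i] * snapshots[k] for k in range(ns)) / np.sqrt(lam[i])
        modes.append(zeta)
    return modes, lam
```

Since only the \(n_{\textrm{train}} \times n_{\textrm{train}}\) Gramian is eigendecomposed, the cost is driven by the number of snapshots rather than by \(N_{\textrm{hf}}\).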
We further recall the strong-greedy algorithm: the algorithm takes as input the snapshot set \(\{ u^k \}_{k=1}^{n_{\textrm{train}}}\), an integer n, an inner product \((\cdot , \cdot )_{\textrm{sg}}\) and the induced norm \(\Vert \cdot \Vert _{\textrm{sg}} = \sqrt{(\cdot , \cdot )_{\textrm{sg}}}\), and returns a set of n indices \(\mathcal {I}_{\textrm{sg}}\subset \{1,\ldots ,n_{\textrm{train}}\}\) (cf. Algorithm 2).
The dimension n can be chosen adaptively by ensuring that the projection error is below a given threshold \(\textrm{tol}_{\textrm{sg}}>0\),
where \(\mathcal {Z}_{n'}\) denotes the \(n'\)dimensional space obtained after \(n'\) steps of the greedy procedure in Algorithm 2, \(\mathcal {Z}_{n'}^\perp \) is the orthogonal complement of the space \(\mathcal {Z}_{n'}\) and \(\Pi _{ \mathcal {Z}_{n'}^\perp }: \mathcal {X} \rightarrow \mathcal {Z}_{n'}^\perp \) is the orthogonal projection operator onto \(\mathcal {Z}_{n'}^\perp \).
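A minimal sketch of the strong-greedy selection, assuming the Euclidean inner product and a relative stopping criterion in the spirit of (8) (the normalization by the largest snapshot norm is one possible choice, not necessarily the authors'):

```python
import numpy as np

def strong_greedy(snapshots, tol):
    """Strong-greedy sketch: repeatedly pick the snapshot that is worst
    approximated by the current space and add its (normalized) component
    orthogonal to the span of the basis; Euclidean inner product assumed."""
    S = [np.asarray(u, dtype=float) for u in snapshots]
    basis, indices = [], []
    while True:
        # projection error onto span(basis) for every snapshot
        errs = []
        for u in S:
            r = u.copy()
            for z in basis:
                r -= np.dot(z, u) * z
            errs.append(float(np.linalg.norm(r)))
        k = int(np.argmax(errs))
        if errs[k] <= tol * max(np.linalg.norm(u) for u in S):
            return basis, indices
        r = S[k].copy()                       # orthogonalize the worst snapshot
        for z in basis:
            r -= np.dot(z, S[k]) * z
        basis.append(r / np.linalg.norm(r))
        indices.append(k)
```

Replacing the HF snapshots by the generalized coordinates of a coarse ROM turns this routine into the initialization step used by the two-fidelity strategy discussed later.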
We conclude this section by reviewing the active set method [21] that is employed to find a sparse solution to the nonnegative least-squares problem:
for a given matrix \(\textbf{G}\in {\mathbb {R}}^{M\times N}\) and a vector \(\textbf{b}\in {\mathbb {R}}^M\). Algorithm 3 reviews the computational procedure. In the remainder, we use notation
to refer to the application of Algorithm 3.
Note that the method takes as input a set of indices—which is initialized with the empty set in the absence of prior information—to initialize the process. Given the matrix \(\textbf{G}= [\textbf{g}_1,\ldots ,\textbf{g}_N]\), the vector \(\textbf{x}\in {\mathbb {R}}^N\), and the set of indices \(P=\{ p_i \}_{i=1}^m \subset \{1,\ldots ,N\}\), we use notation \(\textbf{G}(:,P):=[\textbf{g}_{p_1},\ldots ,\textbf{g}_{p_m} ]\in {\mathbb {R}}^{M\times m}\) and \(\textbf{x}(P) = \textrm{vec} ( (\textbf{x})_{p_1},\ldots , (\textbf{x})_{p_m} ) \in {\mathbb {R}}^m\); we denote by \(\# P \) the cardinality of the discrete set P, and we introduce the complement of P in \(\{1,\ldots ,N \} \) as \(P^\textrm{c}=\{1,\ldots ,N \} \setminus P\). Given the vector \( \textbf{x} \in {\mathbb {R}}^N\) and the set of indices \(\mathcal {I} \subset \{1,\ldots ,N\}\), notation \(\left[ \alpha , {i}^\star \right] = \min _{i\in \mathcal {I} } \left( \textbf{x} \right) _i\) signifies that \(\alpha = \min _{i\in \mathcal {I} } \left( \textbf{x} \right) _i\) and \( {i}^\star \in \mathcal {I}\) realizes the minimum, \(\alpha =\left( \textbf{x} \right) _{ {i}^\star }\). The constant \(\epsilon >0\) is intended to avoid division by zero and is set to \(2^{-1022}\). The computational cost of Algorithm 3 is dominated by the cost of repeatedly solving the least-squares problem at Line 11: in “Numerical results”, we hence report the total number of least-squares solves needed to achieve convergence (cf. output it in Algorithm 3).
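The active-set iteration with warm start can be sketched as follows; this is a simplified Lawson–Hanson-type variant, not a line-by-line transcription of Algorithm 3 (in particular, the handling of an infeasible warm start is a deliberate simplification), with the tolerance-based stopping criterion and a counter for the dominant cost, the least-squares solves:

```python
import numpy as np

def nnls_active_set(G, b, delta, P0=(), max_it=500):
    """Active-set NNLS sketch: minimize ||G @ rho - b||_2 over rho >= 0,
    stopping as soon as the residual norm drops below delta. P0 is an
    optional warm-start guess for the passive (nonzero) set; n_ls counts
    the least-squares solves."""
    M, N = G.shape
    x, n_ls, P = np.zeros(N), 0, []
    if len(P0) > 0:                       # warm start: try the previous support
        P = sorted(set(P0))
        xp = np.linalg.lstsq(G[:, P], b, rcond=None)[0]; n_ls += 1
        if np.all(xp > 0):
            x[P] = xp
        else:
            P = []                        # simplification: fall back to a cold start
    r = b - G @ x
    for _ in range(max_it):
        if np.linalg.norm(r) <= delta:
            break
        w = G.T @ r                       # dual variables
        if P:
            w[P] = -np.inf                # passive entries are not candidates
        j = int(np.argmax(w))
        if w[j] <= 0:
            break                         # KKT point: no ascent direction left
        P = sorted(P + [j])
        while True:                       # inner loop: restore feasibility
            xp = np.linalg.lstsq(G[:, P], b, rcond=None)[0]; n_ls += 1
            if np.all(xp > 0):
                break
            # step toward xp until the first entry hits zero, then drop it;
            # the tiny constant plays the role of epsilon in the text
            alpha = min(x[p] / (x[p] - v + 2.0**-1022)
                        for p, v in zip(P, xp) if v <= 0)
            for p, v in zip(P, xp):
                x[p] += alpha * (v - x[p])
            P = [p for p in P if x[p] > 1e-12]
        x = np.zeros(N)
        x[P] = xp
        r = b - G @ x
    return x, P, n_ls
```

A warm start with a good guess for the support lets the routine terminate after a single least-squares solve, which is the mechanism exploited by the progressive quadrature construction later in the paper.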
Least-square Petrov–Galerkin formulation of steady problems
Given the reducedorder basis (ROB) \(Z=[\zeta _1,\ldots ,\zeta _n]:{\mathbb {R}}^n \rightarrow \mathcal {X}_{\textrm{hf}}\), following [16], we consider the LSPG formulation
where \( \widehat{\mathcal {Y}} = \textrm{span} \{ \psi _i \}_{i=1}^m\) is a suitable test space that is chosen below and the empirical residual \(\mathfrak {R}_{\mu }^{\textrm{eq}}\) satisfies
where \(\varvec{\rho }^{\textrm{eq,e}} \in {\mathbb {R}}_+^{N_{\textrm{e}}}\) and \(\varvec{\rho }^{\textrm{eq,f}} \in {\mathbb {R}}_+^{N_{\textrm{f}}}\) are sparse vectors of nonnegative weights.
Provided that \( \{ \psi _i \}_{i=1}^m\) is an orthonormal basis of \(\widehat{\mathcal {Y}}\), we can rewrite (11a) as
which can be efficiently solved using the Gauss–Newton method (GNM). Note that (12) does not explicitly depend on the choice of the test norm \({\left| \! \left| \! \left| {\cdot } \right| \! \right| \! \right|}\): dependence on the norm is implicit in the choice of \( \{ \psi _i \}_{i=1}^m\). Formulation (11)–(12) depends on the choice of the test space \(\widehat{\mathcal {Y}}\) and the empirical weights \(\varvec{\rho }^{\textrm{eq,e}}, \varvec{\rho }^{\textrm{eq,f}}\): in the remainder of this section, we address the construction of these ingredients.
We remark that Carlberg [1] considered a different projection method, which is based on the minimization of the Euclidean norm of the discrete residual: due to the particular choice of the test norm, the approach of [1] does not require the explicit construction of the empirical test space. We further observe that Yano and collaborators [22, 23] have considered different formulations of the empirical residual (11b). A thorough comparison between different projection methods and different hyperreduction techniques is beyond the scope of the present study.
Construction of the empirical test space
As discussed in [2], the test space \(\widehat{\mathcal {Y}}\) should approximate the Riesz representers of the functionals associated with the action of the Jacobian on the elements of the trial ROB. Given the snapshot set of HF solutions \(\{ u_{\mu }^{\textrm{hf}} : \mu \in \mathcal {P}_{\textrm{train}}\}\) with \(\mathcal {P}_{\textrm{train}}=\{\mu ^k \}_{k=1}^{n} \subset \mathcal {P}\) and the ROB \(\{ \zeta _i \}_i\), we hence apply POD to the test snapshot set \(\mathcal {S}_n^{\textrm{test}} := \left\{ {\Psi }_{k,i}: k=1,\ldots ,n_{\textrm{train}} ,i=1,\ldots ,n \right\} \), where
where \(\mathfrak {J}_{\mu }^{\textrm{hf}} [w]: \mathcal {X}_{\textrm{hf}}\times \mathcal {X}_{\textrm{hf}} \rightarrow {\mathbb {R}}\) denotes the Fréchet derivative of the HF residual at w. It is also useful to provide an approximate computation of the right-hand side of (13),
with \(\epsilon \ll 1\). The evaluation of the right-hand side of (14) involves the computation of the residual at \({u}_{\mu ^k}^{\textrm{hf}} + \epsilon \zeta _i\) for \(i=1,\ldots ,n\); on the other hand, the evaluation of (13) requires the computation of the Jacobian matrix at \({u}_{\mu ^k}^{\textrm{hf}}\) and the post-multiplication by the algebraic counterpart of Z. Both (13) and (14)—which are equivalent in the limit \(\epsilon \rightarrow 0\)—are used in the incremental approach of “Progressive construction of empirical test space and quadrature”.
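In the algebraic setting, the one-sided finite-difference surrogate (14) for the Jacobian-vector product (13) amounts to a single extra residual evaluation per direction; a minimal sketch (with `residual` a placeholder for the discrete HF residual):

```python
import numpy as np

def jacobian_action_fd(residual, u, zeta, eps=1e-6):
    """Finite-difference surrogate (14) for the Jacobian-vector product (13):
    J[u] @ zeta is approximated by (R(u + eps*zeta) - R(u)) / eps, so only
    one extra residual evaluation per direction zeta is required, instead
    of assembling the Jacobian matrix."""
    return (residual(u + eps * zeta) - residual(u)) / eps
```

The approximation error is \(\mathcal {O}(\epsilon )\), so \(\epsilon \) must balance truncation against round-off; the value \(10^{-6}\) used later in the paper is a typical compromise for double precision.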
Construction of the empirical quadrature rule
Following [5], we seek \(\varvec{\rho }^{\textrm{eq,e}} \in {\mathbb {R}}_+^{N_{\textrm{e}}}\) and \(\varvec{\rho }^{\textrm{eq,f}} \in {\mathbb {R}}_+^{N_{\textrm{f}}}\) in (11b) such that

(i)
(efficiency constraint) the number of nonzero entries in \(\varvec{\rho }^{\textrm{eq,e}},\varvec{\rho }^{\textrm{eq,f}}\), \( \texttt {nnz} ( \varvec{\rho }^{\textrm{eq,e}})\) and \(\texttt {nnz} ( \varvec{\rho }^{\textrm{eq,f}} )\), is as small as possible;

(ii)
(constant function constraint) the constant function is approximated correctly in \(\Omega \)
$$\begin{aligned} \Big | \sum _{k=1}^{N_{\textrm{e}}} \rho _k^{\textrm{eq,e}} | \texttt {D}_k | \, - \, | \Omega | \Big | \ll 1, \quad \Big | \sum _{j=1}^{N_{\textrm{f}}} \rho _j^{\textrm{eq,f}} | \texttt {F}_j | \, - \, \sum _{j=1}^{N_{\textrm{f}}} | \texttt {F}_j | \Big | \ll 1; \end{aligned}$$(15) 
(iii)
(manifold accuracy constraint) for all \(\mu \in \mathcal {P}_{\textrm{train},\textrm{eq}} = \{ \mu ^k \}_{k=1}^{n_{\textrm{train}}+n_{\textrm{train},\textrm{eq}}}\), the empirical residual satisfies
$$\begin{aligned} \Big \Vert {\varvec{\mathfrak {R}}}_{\mu }^{\textrm{hf}} ( {\alpha }_{\mu }^{\textrm{train}} ) \, - \, {\varvec{\mathfrak {R}}}_{\mu }^{\textrm{eq}} ( {\alpha }_{\mu }^{\textrm{train}} ) \Big \Vert _2 \ll 1, \end{aligned}$$(16a)where \({\varvec{\mathfrak {R}}}_{\mu }^{\textrm{hf}}\) corresponds to substituting \(\rho _1^{\textrm{eq,e}} = \ldots = \rho _{N_{\textrm{e}}}^{\textrm{eq,e}} = \rho _1^{\textrm{eq,f}} = \ldots = \rho _{N_{\textrm{f}}}^{\textrm{eq,f}} = 1\) in (11b) and \({\alpha }_{\mu }^{\textrm{train}}\) satisfies
$$\begin{aligned} {\alpha }_{\mu }^{\textrm{train}} = \left\{ \begin{array}{ll} \displaystyle { \textrm{arg} \min _{ {\alpha } \in {\mathbb {R}}^n} \; \Vert Z {\alpha } - {u}_{\mu }^{\textrm{hf}} \Vert ,} &{} \textrm{if} \; \mu \in \mathcal {P}_{\textrm{train}} ; \\ \displaystyle { \textrm{arg} \min _{{\alpha } \in {\mathbb {R}}^n} \; \Vert {\varvec{\mathfrak {R}}}_{\mu }^{\textrm{hf}} ( {\alpha } ) \Vert _2,} &{} \textrm{if} \; \mu \notin \mathcal {P}_{\textrm{train}} ; \\ \end{array}\right. \end{aligned}$$(16b)and \(\mathcal {P}_{\textrm{train}} = \{ \mu ^k \}_{k=1}^{n_{\textrm{train}}}\) is the set of parameters for which the HF solution is available. In the remainder, we use \(\mathcal {P}_{\textrm{train},\textrm{eq}} = \mathcal {P}_{\textrm{train}}\).
By tedious but straightforward calculations, we find that
for some row vectors \(\textbf{G}_{k,i}^{\textrm{e}} \in {\mathbb {R}}^{1\times N_{\textrm{e}}}\) and \(\textbf{G}_{k,i}^{\textrm{f}} \in {\mathbb {R}}^{1\times N_{\textrm{f}}}\), \(k=1,\ldots ,n_{\textrm{train}}\) and \(i=1,\ldots ,m\). Therefore, if we define
we find that the constraints (15) and (16) can be expressed in the algebraic form
and \(\varvec{\rho }^{\textrm{hf}} = [1,\ldots ,1]^\top \).
In conclusion, the problem of finding the sparse weights \(\varvec{\rho }^{\textrm{eq,e}}, \varvec{\rho }^{\textrm{eq,f}}\) can be recast as a sparse representation problem
where \(\texttt {nnz} (\varvec{\rho } )\) is the number of nonzero entries in the vector \(\varvec{\rho }\), \(\delta >0\) is a user-defined tolerance, and \(\textbf{b} = \textbf{G} \varvec{\rho }^{\textrm{hf}}\). Following [4], we resort to the NNLS algorithm discussed in “Preliminary definitions and tools” (cf. Algorithm 3) to find an approximate solution to (18).
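A toy analogue of (18) illustrates why NNLS produces sparse rules. The example below is a moment-matching problem with illustrative names (it is not the paper's EQ system), and it uses `scipy.optimize.nnls`, which solves to optimality rather than stopping at a tolerance \(\delta \) as in Algorithm 3:

```python
import numpy as np
from scipy.optimize import nnls

# Toy sparse-representation problem: find nonnegative weights rho that
# reproduce b = G @ rho_hf, the action of the "full" rule. Rows of G collect
# the per-point contributions of the test functions 1, x, x^2 on a 20-point
# grid (all names here are illustrative).
x = np.linspace(0.0, 1.0, 20)
G = np.vstack([np.ones_like(x), x, x**2]) / len(x)   # M = 3 constraints, N = 20 weights
rho_hf = np.ones(len(x))                             # full rule: all weights equal to 1
b = G @ rho_hf

# b lies in the cone spanned by the columns of G, so the optimal residual is
# (numerically) zero and the returned weight vector is sparse.
rho_eq, rnorm = nnls(G, b)
```

With only M = 3 constraints and N = 20 candidate points, the active-set solver returns a rule supported on a handful of points that reproduces the three moments exactly, which is precisely the mechanism the EQ procedure exploits at the scale of mesh elements and facets.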
A posteriori error indicator
The final element of the formulation is the a posteriori error indicator that is employed for the parameter exploration. We here consider the residual-based error indicator (cf. [16])
$$\begin{aligned} \Delta ( \mu ) \, := \, \sup _{v\in \mathcal {X}_{\textrm{hf}} \setminus \{ 0 \} } \, \frac{ \mathfrak {R} ( \widehat{u}_{\mu }, v; \, \mu ) }{ {\left| \! \left| \! \left| v \right| \! \right| \! \right|} }. \end{aligned}$$(19)
Note that the evaluation of (19) requires the solution to a symmetric positive definite linear system of size \(N_{\textrm{hf}}\): it is hence ill-suited for real-time online computations; nevertheless, in our experience the offline cost associated with the evaluation of (19) is comparable with the cost that is needed to solve the ROM—clearly, this is related to the size of the mesh and is hence strongly problem-dependent.
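In the algebraic setting, a residual-based indicator of this type reduces to the dual norm of the HF residual vector, which requires exactly one solve with the SPD Gram matrix of the chosen norm; a minimal sketch (with `K` and `r` standing in for the Gram matrix and the assembled residual):

```python
import numpy as np

def residual_indicator(K, r):
    """Algebraic sketch of a residual-based indicator: the dual norm of the
    HF residual vector r with respect to the norm induced by the SPD Gram
    matrix K, Delta = sqrt(r^T K^{-1} r). The K-solve is the size-N_hf
    symmetric positive definite system mentioned in the text."""
    y = np.linalg.solve(K, r)     # Riesz representer of the residual
    return float(np.sqrt(r @ y))
```

In practice the factorization of K can be computed once and reused across all training parameters, which is why the offline cost per evaluation stays moderate.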
Overview of the computational procedure
Algorithm 4 provides a detailed summary of the construction of the ROM at each step of Algorithm 1 (cf. Line 7). Some comments are in order. The cost of Algorithm 4 is dominated by the assembly of the test snapshot set and by the solution to the NNLS problem. We also notice that the storage of \(\mathcal {S}_n^\textrm{test}\) scales with \(\mathcal {O}(n^2)\) and is hence the dominant memory cost of the offline procedure.
Vanilla POD-greedy for Galerkin time-marching ROMs
We denote by \(u_{\mu }\in {\mathbb {R}}^{d}\) the displacement field, by \(\sigma _{\mu }\in {\mathbb {R}}^{d\times d}\) the Cauchy stress tensor, by \(\varepsilon _{\mu } = \nabla _{\textrm{s}} u_{\mu }\) the strain tensor with \(\nabla _{\textrm{s}} \bullet = \frac{1}{2} (\nabla \bullet + \nabla \bullet ^\top )\); we further introduce the vector of internal variables \(\gamma _{\mu }\in {\mathbb {R}}^{d_{\textrm{int}}}\). Then, we introduce the quasi-static equilibrium equations (completed with suitable boundary and initial conditions)
where \(\mathcal {F}_{\mu }^{\sigma }: {\mathbb {R}}^{d\times d} \times {\mathbb {R}}^{d_{\textrm{int}}} \rightarrow {\mathbb {R}}^{d\times d}\) and \(\mathcal {F}_{\mu }^{\gamma }: {\mathbb {R}}^{d\times d} \times {\mathbb {R}}^{d_{\textrm{int}}} \rightarrow {\mathbb {R}}^{d_{\textrm{int}}}\) are suitable parametric functions that encode the material constitutive law—note that Newton’s law (20)\(_1\) does not include the inertial term; the temporal evolution is hence entirely driven by the constitutive law (20)\(_3\). Equation (20) is discretized using the finite element (FE) method in space and a one-step finite difference (FD) method in time. Given the parameter \(\mu \in \mathcal {P}\), the FE space \(\mathcal {X}_{\textrm{hf}}\) and the time grid \(\{ t^{(k)} \}_{k=1}^K\), we seek the sequence \(\{ u_{\mu }^{(k)} \}_{k=1}^{K} \subset \mathcal {X}_{\textrm{hf}}\) such that
where the elemental residuals satisfy
\(\mathcal {I}_{\textrm{bnd}}\subset \{1,\ldots ,N_{\textrm{f}}\}\) are the indices of the boundary facets and the facet residuals \(\{ r_{j, \mu }^{(k), \mathrm f} \}_{j\in \mathcal {I}_{\textrm{bnd}}}\) incorporate boundary conditions. Note that \(\mathcal {F}_{\mu ,\Delta t}^{\gamma }\) is the FD approximation of the constitutive law (20)\(_3\).
Given the reduced space \(\mathcal {Z}=\textrm{span} \{ \zeta _i \}_{i=1}^n \subset \mathcal {X}_{\textrm{hf}}\), the time-marching hyperreduced Galerkin ROM of (21)–(22) reads as: given \(\mu \in \mathcal {P}\), find the generalized coordinates \(\{ \widehat{\alpha }_{\mu }^{(k)} \}_{k=1}^{K}\) such that
for \(k=1,\ldots ,K\), where \(\varvec{\rho }^{\textrm{eq,e}} \in {\mathbb {R}}_+^{N_{\textrm{e}}}\) and \(\varvec{\rho }^{\textrm{eq,f}} \in {\mathbb {R}}_+^{N_{\textrm{f}}}\) are suitably-chosen sparse vectors of weights. We observe that the Galerkin ROM (23) does not reduce the total number of time steps: the solution to (23) hence requires the solution to a sequence of K nonlinear problems of size n and is thus likely much more expensive than the solution to (Petrov–)Galerkin ROMs for steady-state problems. Furthermore, several independent studies have shown that residual-based a posteriori error estimators for (23) are typically much less sharp and reliable than their counterparts for steady-state problems. These observations have motivated the development of space-time formulations [24]: to our knowledge, however, space-time methods have not been extended to problems with internal variables.
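The time-marching structure of (23) can be sketched generically; `res_reduced` is a hypothetical placeholder for the reduced (hyperreduced) residual at step k, and a generic nonlinear solver stands in for the Newton/Gauss–Newton iterations used in practice:

```python
import numpy as np
from scipy.optimize import fsolve  # generic solver, standing in for Newton/GNM

def march_rom(res_reduced, alpha0, K):
    """Sketch of a time-marching ROM like (23): at each of the K time steps
    a small n-dimensional nonlinear system is solved for the generalized
    coordinates; res_reduced(k, a, a_prev) is a placeholder for the reduced
    residual at step k."""
    traj = [np.asarray(alpha0, dtype=float)]
    for k in range(1, K + 1):
        a_prev = traj[-1]
        # warm start at the previous step's coordinates
        a = fsolve(lambda a: res_reduced(k, a, a_prev), a_prev)
        traj.append(a)
    return traj
```

The loop makes the cost structure explicit: every online query still pays K small nonlinear solves, which is the limitation discussed above.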
The abstract Algorithm 1 can be readily extended to time-marching ROMs for unsteady PDEs: the POD-Greedy method [7] combines a greedy search in the parameter domain with a temporal compression based on POD. At each iteration it of the algorithm, we update the reduced space \(\mathcal {Z}\) using the newly computed trajectory \(\{ u_{\mu ^{\star , it}}^{(k)} \}_{k=1}^K\) and we update the quadrature rule. The reduced space \(\mathcal {Z}\) should be updated using hierarchical methods that avoid the storage of the full space-time trajectory for all sampled parameters: options include the hierarchical approximate POD (HAPOD, see [11] and also “Empirical test space”) and the hierarchical approach in [25, section 3.5], which generates nested spaces; we refer to [26, section 3.2.1] for further details and extensive numerical investigations for a model problem with internal variables. Here, we consider the nested approach of [25]: we denote by \(n_{it}\) the dimension of the reduced space \(\mathcal {Z}_{it}\) at the it-th iteration, and we denote by \(n_{\textrm{new}} = n_{it} - n_{it-1}\) the number of modes added at each iteration,
where \(n_{\textrm{new}}(tol)\) is the smallest number of modes for which the time-weighted projection error of the new trajectory is below a given tolerance \(tol>0\), with \(\Delta t^{(k)} = t^{(k)} - t^{(k-1)} \), \(k=1,\ldots ,K\). The quadrature rule is obtained using the same procedure described in “Empirical quadrature”: exploiting (24), it is easy to verify that the matrix \(\textbf{G}\) (cf. (17c)) can be rewritten as (we omit the details)
Note that \(\textbf{G}_{\textrm{acc}}^{it}\) corresponds to the columns of the matrix \(\textbf{G}\) associated with the manifold accuracy constraints (cf. (16)) at the it-th iteration of the greedy procedure.
Accelerated construction of PROMs
“Progressive construction of empirical test space and quadrature” discusses the incremental strategies proposed in this work to reduce the costs associated with the construction of the empirical test space and the empirical quadrature in Algorithm 4; “Multifidelity sampling” summarizes the multifidelity sampling strategy, which is designed to reduce the total number of greedy iterations required by Algorithm 1. “Progressive construction of empirical test space and quadrature” and “Multifidelity sampling” focus on LSPG ROMs for steady problems; in “Extension to unsteady problems” we discuss the extension to Galerkin ROMs for unsteady problems.
Progressive construction of empirical test space and quadrature
Empirical test space
By reviewing Algorithms 1 and 4, we notice that the test snapshot set \(\mathcal {S}_n^{\textrm{test}}\) satisfies
$$\begin{aligned} \mathcal {S}_n^{\textrm{test}} \, = \, \mathcal {S}_{n-1}^{\textrm{test}} \, \cup \, \left\{ \Psi _{k,n} \right\} _{k=1}^{n-1} \, \cup \, \left\{ \Psi _{n,i} \right\} _{i=1}^{n} ; \end{aligned}$$(26)
therefore, at each iteration, it suffices to solve \(2n-1\)—as opposed to \(n^2\)—Riesz problems of the form (13) to define \(\mathcal {S}_n^{\textrm{test}}\). As in [16], we rely on Cholesky factorization with fill-in reducing permutations of rows and columns to reduce the cost of solving the Riesz problems. In the numerical experiments, we rely on (13), which involves the assembly of the Jacobian matrix, to compute \(\left\{ \Psi _{n,i} \right\} _{i=1}^n\), while we consider the finite difference approximation (14) to compute \( \left\{ \Psi _{k,n} \right\} _{k=1}^{n-1}\) with \(\epsilon =10^{-6}\).
In order to lower the memory costs associated with the storage of the test snapshot set \(\mathcal {S}_n^{\textrm{test}}\) and also the cost of performing POD, we consider a hierarchical approach to construct the test space \(\widehat{\mathcal {Y}}\). In this work, we apply the (distributed) hierarchical approximate proper orthogonal decomposition (HAPOD), which is related to incremental singular value decomposition [27] and guarantees near-optimal performance with respect to standard POD (cf. [11]). Given the POD space \(\{ \psi _i \}_{i=1}^m\) such that \((( \psi _i , \psi _j )) = \delta _{i,j}\) and the corresponding eigenvalues \(\{ \lambda _i \}_{i=1}^m\), and the new set of snapshots \(\left\{ \Psi _{k,n} \right\} _{k=1}^{n-1} \cup \left\{ \Psi _{n,i} \right\} _{i=1}^n\), HAPOD corresponds to applying POD to a suitable snapshot set that combines information from current and previous iterations,
with \(\mathcal {S}_n^{\textrm{incr}}:= \{ \sqrt{\lambda _i} \psi _i \}_{i=1}^{m} \cup \left\{ \Psi _{k,n} \right\} _{k=1}^{n-1} \cup \left\{ \Psi _{n,i} \right\} _{i=1}^n.\) Note that the modes \(\{ \psi _i \}_{i=1}^{m} \) are scaled to properly take into account the energy content of the modes in the subsequent iterations. Note also that the storage cost of the method scales with \(m + 2n-1\), which is much lower than \(n^2\), provided that \(m\ll n^2\). Note also that the POD spaces \(\{ \widehat{\mathcal {Y}}_n \}_n\), which are generated at each iteration of Algorithm 1 using (27), are not nested.
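One HAPOD-style update step can be sketched as follows, assuming the Euclidean inner product (so that the method of snapshots coincides with an SVD of the pooled snapshot matrix); this is an illustration of the rescale-pool-recompress idea, not the authors' distributed implementation:

```python
import numpy as np

def hapod_update(modes, lams, new_snaps, m_max):
    """One HAPOD-style update (sketch): the previous modes are rescaled by
    sqrt(lambda_i) to preserve their energy content, pooled with the new
    snapshots, and compressed again; only m + (number of new snapshots)
    vectors are ever stored, instead of the full snapshot set."""
    pool = [np.sqrt(l) * z for z, l in zip(modes, lams)] \
         + [np.asarray(s, dtype=float) for s in new_snaps]
    A = np.array(pool).T                          # columns: pooled snapshots
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    m = min(m_max, int(np.sum(s > 1e-12)))        # drop numerically null modes
    return [U[:, i] for i in range(m)], [float(s[i]) ** 2 for i in range(m)]
```

Calling the routine once per greedy iteration with the \(2n-1\) new test snapshots reproduces the storage scaling \(m + 2n-1\) discussed above.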
Empirical quadrature
Exploiting (17), it is straightforward to verify that if the test spaces are nested—that is, \(\widehat{\mathcal {Y}}_{n-1} \subset \widehat{\mathcal {Y}}_n\)—then the EQ matrix \(\textbf{G}_n\) at iteration n satisfies
where \(\textbf{G}_n^{\textrm{new}}\) has \(k = n \times \textrm{dim}( \widehat{\mathcal {Y}}_n \setminus \widehat{\mathcal {Y}}_{n-1} ) + \textrm{dim}( \widehat{\mathcal {Y}}_{n-1} )\) rows. We hence observe that we can reduce the cost of assembling the EQ matrix \(\textbf{G}_n\) by exploiting (28), provided that we rely on nested test spaces: since, in our experience, the cost of assembling \(\textbf{G}_n \) is negligible, the use of non-nested test spaces does not hinder offline performance.
On the other hand, due to the strong link between the EQ problems solved at consecutive iterations of the greedy procedure, we propose to initialize the NNLS Algorithm 3 using the solution from the previous iteration, that is
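The warm start can be sketched with a Lawson–Hanson-type active-set solver that accepts the support of the previous iteration's weights as an initial guess. The following is an illustrative sketch, not the paper's Algorithm 3: the convergence criterion, bookkeeping, and fallback behavior are our assumptions.

```python
import numpy as np

def nnls_warm(G, b, support=None, tol=1e-10, maxit=500):
    """Lawson-Hanson-type NNLS: min ||G x - b|| s.t. x >= 0, warm-started
    from a guessed support (here, the nonzero pattern of the weights
    found at the previous greedy iteration)."""
    n = G.shape[1]
    P = set(support) if support is not None else set()
    x = np.zeros(n)
    if P:  # solve restricted LS on the warm-start support, if feasible
        idx = sorted(P)
        xp, *_ = np.linalg.lstsq(G[:, idx], b, rcond=None)
        if np.all(xp > 0):
            x[idx] = xp
        else:
            P = set()  # fall back to a cold start if the guess is infeasible
    it = 0
    while it < maxit:
        w = G.T @ (b - G @ x)          # negative gradient of 1/2||Gx-b||^2
        Z = [j for j in range(n) if j not in P]
        if not Z or np.max(w[Z]) <= tol:
            return x, it               # KKT conditions satisfied
        P.add(Z[int(np.argmax(w[Z]))])
        while True:                    # inner loop: restore feasibility
            idx = sorted(P)
            xp, *_ = np.linalg.lstsq(G[:, idx], b, rcond=None)
            if np.all(xp > tol):
                x = np.zeros(n); x[idx] = xp
                break
            # step toward xp until the first variable hits zero
            alpha = min(x[j] / (x[j] - xp[k])
                        for k, j in enumerate(idx) if xp[k] <= tol)
            for k, j in enumerate(idx):
                x[j] += alpha * (xp[k] - x[j])
            P = {j for j in P if x[j] > tol}
        it += 1
    return x, it
```

When the support from the previous greedy iteration is close to the final one, the solver terminates in few (possibly zero) outer iterations, which is the effect measured in the numerical experiments below.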
We provide extensive numerical investigations to assess the effectiveness of this choice.
Summary of the incremental weak-greedy algorithm
Algorithm 5 reviews the full incremental generation of the ROM. In the numerical experiments, we set \(m=2n\): this implies that the storage of \(\mathcal {S}^\textrm{incr}\) scales with \(4n-3\) as opposed to \(n^2\).
Multifidelity sampling
The incremental strategies of “Progressive construction of empirical test space and quadrature” do not affect the total number of greedy iterations required by Algorithm 1 to achieve the desired accuracy; furthermore, they do not reduce the cost of the HF solves. Following [16], we here propose to resort to coarser simulations to learn the initial training set \(\mathcal {P}_{\star }\) (cf. Line 1, Algorithm 1) and to initialize the HF solver. Algorithm 6 summarizes the computational procedure.
Some comments are in order.

- In the numerical experiments, we choose the cardinality \(n_0\) of \(\mathcal {P}_\star \) according to (8) with \(\textrm{tol}_{\textrm{sg}} =\texttt {tol}\) (cf. Algorithm 1). Note that, since the strong-greedy algorithm is applied to the generalized coordinates of the coarse ROM, \(n_0\) cannot exceed the size \(n\) of the ROM.
- We observe that increasing the cardinality of \(\mathcal {P}_\star \) ultimately reduces the number of sequential greedy iterations; it hence enables a much more effective parallelization of the offline stage. Note also that Algorithm 6 can be coupled with parametric mesh adaptation tools to build an effective problem-aware mesh \(\mathcal {T}_{\textrm{hf}}\) (cf. [16]); in this work, we do not exploit this feature of the method.
- We observe that the multifidelity Algorithm 6 critically depends on the choice of the coarse grid \(\mathcal {T}_{\textrm{hf}}^0\): an excessively coarse mesh might undermine the quality of the initial sample \(\mathcal {P}_\star \) and of the initial condition for the HF solve, while an excessively fine mesh reduces the computational gain. In the numerical experiments, we provide extensive investigations of the influence of the coarse mesh on performance.
Extension to unsteady problems
Acceleration of the POD-greedy algorithm for the Galerkin ROM (23) relies on two independent building blocks: first, the incremental construction of the quadrature rule; second, a two-fidelity sampling strategy. Regarding the quadrature procedure, we notice that at each greedy iteration the matrix \(\textbf{G}\) in (25) admits the decomposition in (28). If the number of new modes is modest compared to \(n_{it-1}\), the rows of \(\textbf{G}_{\textrm{new}}^{(it)}\) are significantly less numerous than the columns of \(\textbf{G}^{(it)}\): we can hence reduce the cost of constructing \(\textbf{G}^{(it)}\) by keeping the EQ matrix from the previous iteration in memory; furthermore, the decomposition (28) motivates the use of the active set of weights from the previous iteration to initialize the NNLS algorithm at the current iteration. On the other hand, the extension of the two-fidelity sampling strategy in Algorithm 6 simply relies on the generalization of the strong-greedy procedure 2 to unsteady problems, which is illustrated in the algorithm below.
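The cache-and-append construction of the EQ matrix can be sketched as follows; `assemble_rows` is a hypothetical problem-specific callback (it would evaluate residual products of the new test modes at the quadrature candidates), and the class layout is ours, not the paper's.

```python
import numpy as np

class IncrementalEQMatrix:
    """Cache-and-append construction of the EQ matrix across greedy
    iterations: rows associated with previously seen test modes are kept
    in memory, and only the rows for the new modes are assembled."""

    def __init__(self, assemble_rows):
        self.assemble_rows = assemble_rows  # modes -> (n_rows x N_e) array
        self.G = None
        self.n_seen = 0

    def update(self, modes):
        new = modes[self.n_seen:]           # only modes added this iteration
        rows = self.assemble_rows(new)      # expensive step, now restricted
        self.G = rows if self.G is None else np.vstack([self.G, rows])
        self.n_seen = len(modes)
        return self.G
```

The saving is exactly the one described above: when few modes are added at iteration \(it\), `assemble_rows` touches only those modes, while the bulk of \(\textbf{G}^{(it)}\) is recycled.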
Algorithm 7 takes as input a set of trajectories and returns the indices of the selected parameters. To shorten the notation, we here assume that the time grid is the same for all parameters; however, the algorithm can be readily extended to cope with parameter-dependent temporal discretizations. We further observe that the procedure depends on the data compression strategy employed to update the reduced space at each greedy iteration: in this work, we consider the same hierarchical strategy (cf. [25, section 3.5]) employed in the POD-(weak-)greedy algorithm.
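A hedged sketch of a strong-greedy selection over trajectories, in the spirit of Algorithm 7: at each step we pick the parameter whose trajectory is worst approximated by the current reduced space, then enrich the space with the POD of its projection residual. The truncation rule and the QR re-orthonormalization are illustrative choices, not the paper's exact compression strategy.

```python
import numpy as np

def strong_greedy_unsteady(trajectories, n_params, tol_pod=1e-5):
    """Select `n_params` parameter indices from a list of trajectories,
    each stored as an (N x K) snapshot matrix (common time grid)."""
    Z = None          # orthonormal reduced basis, grown greedily
    selected = []
    for _ in range(n_params):
        def err(S):   # relative projection error of a trajectory
            R = S if Z is None else S - Z @ (Z.T @ S)
            return np.linalg.norm(R) / np.linalg.norm(S)
        cand = [i for i in range(len(trajectories)) if i not in selected]
        istar = max(cand, key=lambda i: err(trajectories[i]))
        selected.append(istar)
        # enrich the space with the leading modes of the residual
        S = trajectories[istar]
        R = S if Z is None else S - Z @ (Z.T @ S)
        U, s, _ = np.linalg.svd(R, full_matrices=False)
        keep = s > tol_pod * (s[0] if s[0] > 0 else 1.0)
        Z = U[:, keep] if Z is None else np.hstack([Z, U[:, keep]])
        Z, _ = np.linalg.qr(Z)  # re-orthonormalize the enriched basis
    return selected
```

In the two-fidelity setting, the trajectories would be the (cheap) coarse-mesh solutions or their generalized coordinates, and the returned indices define \(\mathcal {P}_\star \).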
Numerical results
We present numerical results for a steady compressible inviscid flow past a LS89 blade (cf. “Transonic compressible flow past an LS89 blade”), and for an unsteady nonlinear mechanics problem that simulates the long-time mechanical response of the standard section of a containment building under external loading (cf. “Longtime mechanical response of the standard section of a containment building under external loading”). Simulations of “Transonic compressible flow past an LS89 blade” are performed in Matlab 2022a [28] using an in-house code, and executed on a commodity Linux workstation (RAM 32 GB, Intel i7 CPU 3.20 GHz x 12). The HF simulations of “Longtime mechanical response of the standard section of a containment building under external loading” are performed using the FE software code_aster [29] and executed on a commodity Linux workstation (RAM 32 GB, Intel i7-9850H CPU 2.60 GHz x 12); on the other hand, the MOR procedure relies on an in-house Python code and is executed on a Windows workstation (RAM 16 GB, Intel i7-9750H CPU 2.60 GHz x 12).
Transonic compressible flow past an LS89 blade
Model problem
We consider the problem of estimating the solution to the two-dimensional Euler equations past an array of LS89 turbine blades; the same model problem is considered in [30] for a different parameter range. We consider the computational domain depicted in Fig. 1a; we prescribe total temperature, total pressure and flow direction at the inflow, static pressure at the outflow, a non-penetration (wall) condition on the blade, and periodic boundary conditions on the lower and upper boundaries. We study the sensitivity of the solution with respect to two parameters: the height of the channel \(H\) and the free-stream Mach number \(\textrm{Ma}_{\infty }\), \(\mu =[H,\textrm{Ma}_{\infty }]\). We consider the parameter domain \(\mathcal {P}=[0.9,1.1]\times [0.2,0.9]\).
We deal with geometry variations through a piecewise-smooth mapping associated with the partition in Fig. 1a. We set \(H_{\textrm{ref}}=1\) and we define the curve \(x_1\mapsto f_{\textrm{btm}}(x_1)\) that describes the lower boundary \(\Gamma _{\textrm{btm}}\) of the domain \(\Omega =\Omega (H=1)\); then, we define \(\widetilde{H}>0\) such that \(x_1\mapsto f_{\textrm{btm}}(x_1)+\widetilde{H}\) and \(x_1\mapsto f_{\textrm{btm}}(x_1)+H-\widetilde{H}\) do not intersect the blade for any \(H\in [0.9,1.1]\); finally, we define the geometric mapping
where
with \(o_1(x_1)=f_{\textrm{btm}}(x_1)+\widetilde{H}\), \(o_2(x_1)=f_{\textrm{btm}}(x_1)+H_{\textrm{ref}} - \widetilde{H}\) and \(C(H) = \frac{H-H_{\textrm{ref}}}{2\widetilde{H}}+1\). Figure 1b, c show the distribution of the Mach field for \(\mu ^{(1)}=[0.95,0.78]\) and \(\mu ^{(2)}=[1.05,0.88]\). We notice that for large values of the free-stream Mach number the solution develops a normal shock on the upper side of the blade and two shocks at the trailing edge: effective approximation for higher values of the Mach number requires the use of nonlinear approximations and is beyond the scope of the present work.
Results
We consider a hierarchy of six P2 meshes with \(N_{\textrm{e}} = 1827, 2591, 3304, 4249, 7467, 16353\) elements, respectively; Fig. 2a–c show three of the computational meshes. Figure 2d shows the maximum and mean errors between the HF solution \(u_\mu ^{\textrm{hf}, (6)}\) and the corresponding HF solution associated with the \(i\)-th mesh,
over five randomly-chosen parameters in \(\mathcal {P}\). We remark that the error is measured in the reference configuration \(\Omega \), which corresponds to \(H=1\); we consider the \(L^2(\Omega )\) norm \(\Vert \cdot \Vert = \sqrt{\int _{\Omega } (\cdot )^2 \, dx }\).
Figure 3 compares the performance of the standard (“std”) greedy method with that of the incremental (“incr”) procedures of “Progressive construction of empirical test space and quadrature” for the coarse mesh (mesh 1). In all the tests below, we consider a training set \(\mathcal {P}_{\textrm{train}}\) based on a ten-by-ten equispaced discretization of the parameter domain. Figure 3a shows the number of iterations of the NNLS Algorithm 3, Fig. 3b shows the wall-clock cost (in seconds) on a commodity laptop, and Fig. 3c shows the percentage of sampled weights \(\frac{\texttt {nnz}(\varvec{\rho }^{\textrm{eq}})}{N_{\textrm{e}} +N_{\textrm{f}}} \times 100 \%\). We observe that the proposed initialization of the active set method leads to a significant reduction of the total number of iterations, and to a non-negligible reduction of the total wall-clock cost (cf. Note 1), without affecting the performance of the method. Figure 3d shows the cost of constructing the test space: we notice that the progressive construction of the test space enables a significant reduction of offline costs. Finally, Fig. 3e shows the averaged out-of-sample performance of the LSPG ROM over \(n_{\textrm{test}}=20\) randomly-chosen parameters \(\mathcal {P}_{\textrm{test}}\),
and Fig. 3f shows the online costs: we observe that the standard and the incremental approaches lead to nearly equivalent results for all values of the ROB size \(n\) considered.
Figure 4 replicates the tests of Fig. 3 for the fine mesh (mesh 6). As for the coarser grid, we find that the progressive construction of the test space and of the quadrature rule does not hinder the online performance of the LSPG ROM and ensures significant offline savings. We notice that the relative error over the test set is significantly larger: reduction of the mesh size leads to more accurate approximations of sharp features (shocks, wakes) that are troublesome for linear approximations. On the other hand, we notice that online costs are nearly the same as for the coarse mesh for the corresponding value of the ROB size \(n\): this demonstrates the effectiveness of the hyper-reduction procedure.
Figure 5 investigates the effectiveness of the sampling strategy based on the strong-greedy algorithm. We rely on Algorithm 6 to identify the training set of parameters \(\mathcal {P}_\star \) for different choices of the coarse mesh \(\mathcal {T}_{\textrm{hf}}^0=\mathcal {T}_{\textrm{hf}}^{(i)}\), \(i=1,\ldots ,6\), with \(\mathcal {T}_{\textrm{hf}} =\mathcal {T}_{\textrm{hf}}^{(6)}\). Then, we measure performance in terms of the maximum relative projection error over \(\mathcal {P}_{\textrm{train}}\),
where \(\mathcal {P}_{\star ,n'}^{(i)}\) is the set of the first \(n'\) parameters selected through Algorithm 6 based on the coarse mesh \(\mathcal {T}_{\textrm{hf}}^{(i)}\). Figure 5a shows the behavior of the projection error \(E_{n}^{\textrm{proj},(i)}\) (32) for three different choices of the coarse mesh; to provide a concrete reference, we also report the performance of twenty sequences of reduced spaces obtained by randomly selecting sequences of parameters in \(\mathcal {P}_{\textrm{train}}\). Figure 5b, c show the parameters selected through Algorithm 6 for two different choices of the coarse mesh: we observe that the selected parameters are clustered in the proximity of \(\textrm{Ma}_{\infty }=0.9\).
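For an orthonormal basis, the maximum relative projection error of (32) reduces to a few lines of numpy. This sketch stores snapshots as columns and uses the Euclidean norm as a stand-in for the \(L^2(\Omega)\) inner product; both choices are illustrative assumptions.

```python
import numpy as np

def max_relative_projection_error(snapshots, Z):
    """Maximum relative projection error over a training set:
    snapshots is N x n_train (one HF solution per column) and
    Z is an N x n basis with orthonormal columns."""
    residual = snapshots - Z @ (Z.T @ snapshots)   # I - Z Z^T applied columnwise
    return np.max(np.linalg.norm(residual, axis=0)
                  / np.linalg.norm(snapshots, axis=0))
```

In practice one would precompute the HF snapshots over \(\mathcal {P}_{\textrm{train}}\) once and reuse them to evaluate the error for every candidate basis \(\mathcal {Z}_{n}^{(i)}\).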
Table 1 compares the costs of the standard weak-greedy algorithm (“vanilla”), the weak-greedy algorithm with progressive construction of the test space and the quadrature rule (“incr”), and the two-fidelity Algorithm 6 with coarse mesh given by \(\mathcal {T}_{\textrm{hf}}^0= \mathcal {T}_{\textrm{hf}}^{(1)}\) (“incr+MF”). To ensure a fair comparison, we impose that the final ROM has the same number of modes (twenty) for all cases. Training of the coarse ROM in Algorithm 6 is based on the weak-greedy algorithm with progressive construction of the test space and the quadrature rule, with tolerance \(\texttt {tol}=10^{-3}\): this leads to a coarse ROM with \(n_0=14\) modes, which corresponds to an initial training set \(\mathcal {P}_\star \) of cardinality 14 in the greedy method (cf. Line 5, Algorithm 6). The ROMs associated with the three different training strategies show comparable performance in terms of online cost and \(L^2\) errors.
For the fine mesh \(\mathcal {T}_{\textrm{hf}}^{(6)}\), the two-fidelity training leads to a reduction of offline costs of roughly \(25\%\) with respect to the vanilla implementation and of roughly \(10\%\) with respect to the incremental implementation; for the fine mesh \(\mathcal {T}_{\textrm{hf}}^{(5)}\) (which has roughly half as many elements as the finer grid), the two-fidelity training leads to a reduction of offline costs of roughly \(39\%\) with respect to the vanilla implementation and of roughly \(31\%\) with respect to the incremental implementation. In particular, we notice that the initialization based on the coarse model — which is the same for both cases — is significantly more effective for the HF model associated with mesh 5 than for the HF model associated with mesh 6. This empirical finding supports the observation made in “Multifidelity sampling” that the choice of the coarse approximation is a compromise between overhead costs, which increase with the size of the coarse mesh, and accuracy of the coarse solution.
We remark that for the first two cases the HF solver is initialized using the solution for the closest parameter in the training set
for \(n=2,3,\ldots \); for \(n=1\), we rely on a coarse solver and on a continuation strategy with respect to the Mach number: this initialization is inherently sequential. On the other hand, in the two-fidelity procedure, the HF solver is initialized using the reduced-order solver trained on coarse HF data: this choice enables trivial parallelization of the HF solves for the initial parameter sample \(\mathcal {P}_\star \). We hence expect higher computational savings in parallel computations.
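The nearest-parameter initialization can be sketched as follows; the Euclidean distance in parameter space is our assumption (the text does not specify the metric), and the data layout is illustrative.

```python
import numpy as np

def nearest_parameter_init(mu, solved):
    """Return the converged HF solution of the closest previously solved
    parameter, to be used as initial guess for the HF solve at `mu`.
    `solved` is a list of (parameter tuple, solution) pairs."""
    mus = np.array([m for m, _ in solved])
    k = int(np.argmin(np.linalg.norm(mus - np.asarray(mu), axis=1)))
    return solved[k][1]
```

Because each call needs the solutions of all previously visited parameters, this warm start is sequential; replacing it with a coarse-trained ROM prediction, as in the two-fidelity procedure, removes that dependency.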
Longtime mechanical response of the standard section of a containment building under external loading
Model problem
We study the long-time mechanical response of a three-dimensional standard section of a nuclear containment building (NCB): the highly nonlinear mechanical response is activated by thermal effects; its simulation requires the coupling of thermal, hydraulic and mechanical (THM) responses. A thorough presentation of the mathematical model and of the MOR procedure is provided in [31]. The ultimate goal of the simulation is to predict the temporal behavior of several quantities of interest (QoIs), such as water saturation in concrete, delayed deformations, and stresses: these QoIs are directly related to the leakage rate, whose estimate is of paramount importance for the design of NCBs. The deformation field is also important to conduct validation and calibration studies against real-world data.
Following [32], we consider a weak THM coupling procedure to model deferred deformations within the material; weak coupling is appropriate for large structures under normal operational loads. The MOR process is exclusively applied to estimate the mechanical response: the results from thermal and hydraulic calculations are indeed used as input data for the mechanical calculations, which constitute the computational bottleneck of the entire simulation workflow. To model the mechanical response of the concrete structure, we consider a three-dimensional nonlinear rheological creep model with internal variables; on the other hand, we consider a one-dimensional linear elastic model for the prestressing cables: the state variables are hence the displacement field of the three-dimensional concrete structure and of the one-dimensional steel cables. We assume that the whole structure satisfies the small-strain and small-displacement hypotheses. To establish connectivity between concrete and steel nodes, a kinematic linkage is implemented: a point within the steel structure and its corresponding point within the concrete structure are assumed to share identical displacements.
We study the solution behavior with respect to two parameters: the desiccation creep viscosity (\(\eta _{\textrm{dc}}\)) and the basic creep consolidation parameter (\(\kappa \)), in the parameter range \(\mu \in \mathcal {P} = [5 \cdot 10^{8}, 5\cdot 10^{10}]\times [10^{-5}, 10^{-3}] \subset {\mathbb {R}}^2\). Figure 6 shows the behavior of (a) the normal force on a horizontal cable, and (b) the tangential and (c) vertical strains on the outer wall of the standard section of the containment building, for three distinct parameter values \(\mu ^{(i)} = (5\cdot 10^{9}, \kappa ^{(i)})\), with \(\kappa ^{(i)} \in \{10^{-5}, 10^{-4}, 10^{-3}\}\), \(i=1,2,3\). The notation “E” indicates that the HF data are associated with the outer face of the structure. Note that the value of the consolidation parameter \(\kappa \) affects the rate of decay of the various quantities.
Results
High-fidelity solver. We consider two distinct three-dimensional meshes, depicted in Fig. 7: a coarse mesh of \(N_{\textrm{e}}=784\) hexahedral elements and a refined mesh with \(N_{\textrm{e}}=1600\) elements. The mesh features the geometry of a portion of the building halfway up the barrel, and is crossed by two vertical and three horizontal cables. We consider an adaptive time-stepping scheme: approximately 45 to 50 time steps are needed to reach the specified final time, for all parameters in \(\mathcal {P}\) and for both meshes. Table 2 provides an overview of the costs of the HF solver over the training set for the two meshes: the wall-clock cost of a full HF simulation is roughly nine minutes for the coarse mesh and seventeen minutes for the refined mesh. We consider a 7-by-7 training set \(\mathcal {P}_{\textrm{train}}\) and a 5-by-5 test set; parameters are logarithmically spaced in both directions.
Figure 8 showcases the evolution of normal forces in the central horizontal cable (\(N_{\textrm{H}2}\)), the vertical (\(\varepsilon _{zz}\)) and the tangential (\(\varepsilon _{tt}\)) deformations on the outer surface of the geometry. Table 3 shows the behavior of the maximum and average relative errors
for the three quantities of interest of Fig. 8. We notice that the two meshes lead to nearly equivalent results for this model problem.
Model reduction. We assess the performance of the Galerkin ROM over training and test sets. Given the sequence of parameters \(\{ \mu ^{\star , it} \}_{it=1}^{\texttt {maxit}}\), for \(it=1,\ldots , \texttt {maxit}\),

1. we solve the HF problem to find the trajectory \(\{ u_{ \mu ^{\star , it}}^{(k)} \}_{k=1}^K\);
2. we update the reduced space \(\mathcal {Z}\) using (24) with tolerance \(\texttt {tol}=10^{-5}\);
3. we update the quadrature rule using the (incremental) strategy described in “Extension to unsteady problems”.
Below, we assess (i) the effectiveness of the incremental strategy for the construction of the quadrature rule, and (ii) the impact of the sampling strategy on performance. Towards this end, we compute the projection error
where \(\Pi _{\mathcal {Z}}\) denotes the projection onto the reduced space \(\mathcal {Z}\) and \(\vert \! \vert \! \vert \cdot \vert \! \vert \! \vert \) is the discrete \(L^2(0, T; \Vert \cdot \Vert )\) norm; as in [31], we consider \(\Vert \cdot \Vert = \Vert \cdot \Vert _2\). Further results on the prediction error and online speedups of the ROM are provided in [31].
Figure 9 illustrates the performance of the EQ procedure; in this test, we consider the finer mesh (mesh 2), and we select the parameters \(\{ \mu ^{\star , it} \}_{it=1}^{\texttt {maxit}=15}\) using the POD-strong-greedy Algorithm 7 based on the HF results on mesh 1. Figure 9a–c show the number of iterations required by Algorithm 3 to meet the convergence criterion, the computational cost, and the percentage of sampled elements, which is directly related to the online costs, for \(\delta =10^{-4}\). As for the previous test case, we observe that the progressive construction of the quadrature rule drastically reduces the number of NNLS iterations without hindering performance. Figure 9d shows the speedup of the incremental method for three choices of the tolerance \(\delta \) and for several iterations of the iterative procedure. We notice that the speedup increases with the iteration count: this can be explained by observing that the percentage of new columns added to the matrix \(\textbf{G}\) during the \(it\)-th step decays with \(it\) (cf. (25)). We further observe that the speedup increases as we decrease the tolerance \(\delta \): this is justified by the fact that the number of iterations required by Algorithm 3 increases as we decrease \(\delta \).
Figures 10 and 11 investigate the efficacy of the greedy sampling strategy. Figure 10 shows the parameters \(\{ \mu ^{\star ,j} \}_j\) selected by Algorithm 7 for (a) the coarse mesh (mesh 1) and (b) the refined mesh (mesh 2): we observe that the majority of the sampled parameters are clustered in the bottom left corner of the parameter domain for both meshes. Figure 11 shows the performance (measured through the projection error (34)) of the two samples depicted in Fig. 10 on training and test sets; to provide a concrete reference, we also compare performance with five randomly-generated samples. Interestingly, we observe that for this model problem the choice of the sampling strategy has little effect on performance; nevertheless, also in this case, the greedy procedure based on the coarse mesh consistently outperforms random sampling.
Conclusions
The computational burden of the offline training stage remains an outstanding challenge of MOR techniques for nonlinear, non-parametrically-affine problems. Adaptive sampling of the parameter domain based on greedy methods can reduce the number of offline HF solves needed to meet the target accuracy; however, greedy methods are inherently sequential and introduce non-negligible overhead that might ultimately hinder the benefit of adaptive sampling. To address these issues, in this work we proposed two new strategies to accelerate greedy methods: first, a progressive construction of the ROM, based on HAPOD to speed up the construction of the empirical test space for LSPG ROMs and on a warm start of the NNLS algorithm to determine the empirical quadrature rule; second, a two-fidelity sampling strategy to reduce the number of expensive greedy iterations.
The numerical results of “Numerical results” illustrate the effectiveness and the generality of our methods for both steady and unsteady problems. First, we found that the warm start of the NNLS algorithm enables a non-negligible reduction of the computational cost without hindering performance. Second, we observed that the sampling strategy based on coarse data leads to near-optimal performance: this result suggests that multifidelity algorithms might be particularly effective to explore the parameter domain in the MOR training phase.
The empirical findings of this work motivate further theoretical and numerical investigations that we wish to pursue in the future. First, we wish to analyze the performance of multifidelity sampling methods for MOR: the ultimate goal is to devise a priori and a posteriori indicators to drive the choice of the mesh hierarchy and of the cardinality \(n_0\) of the set \(\mathcal {P}_\star \) in Algorithm 6. Second, we plan to extend the two elements of our formulation (progressive ROM generation and multifidelity sampling) to optimization problems for which the primary objective of model reduction is to estimate a suitable quantity of interest (goal-oriented MOR): in this respect, we envision combining our formulation with adaptive techniques for optimization [33,34,35].
Availability of data and materials
The data that support the findings of this study are available from the corresponding author, TT, upon reasonable request. The software developed for the investigations of section 4.2 is expected to be integrated in the open-source Python library Mordicus [36], which is funded by a French “Fonds Unique Interministériel” (FUI) project.
Notes
1. The computational cost reduction is not as significant as the reduction in the total number of iterations, because the cost per iteration depends on the size of the least-squares problem to be solved (cf. Line 11, Algorithm 3), which increases as we increase the cardinality of the active set.
References
Carlberg K, Farhat C, Cortial J, Amsallem D. The GNAT method for nonlinear model reduction: effective implementation and application to computational fluid dynamics and turbulent flows. J Comput Phys. 2013;242:623–47.
Taddei T, Zhang L. Spacetime registrationbased model reduction of parameterized onedimensional hyperbolic PDEs. ESAIM Math Model Numer Anal. 2021;55(1):99–130.
Ryckelynck D. Hyperreduction of mechanical models involving internal variables. Int J Numer Method Eng. 2009;77(1):75–89.
Farhat C, Chapman T, Avery P. Structurepreserving, stability, and accuracy properties of the energyconserving sampling and weighting method for the hyper reduction of nonlinear finite element dynamic models. Int J Numer Method Eng. 2015;102(5):1077–110.
Yano M, Patera AT. An LP empirical quadrature procedure for reduced basis treatment of parametrized nonlinear PDEs. Comput Methods Appl Mech Eng. 2019;344:1104–23.
Veroy K, Prud’Homme C, Rovas D, Patera A. A posteriori error bounds for reducedbasis approximation of parametrized noncoercive and nonlinear elliptic partial differential equations. In: 16th AIAA Computational Fluid Dynamics Conference, p. 3847, 2003.
Haasdonk B, Ohlberger M. Reduced basis method for finite volume approximations of parametrized linear evolution equations. ESAIM Math Model Numer Anal. 2008;42(2):277–302.
Sirovich L. Turbulence and the dynamics of coherent structures. Part I: Coherent structures. Quart Appl Math. 1987;45(3):561–71.
Volkwein S. Model reduction using proper orthogonal decomposition. Lecture Notes, Institute of Mathematics and Scientific Computing, University of Graz. see http://www.unigraz.at/imawww/volkwein/POD.pdf. 1025; 2011.
Cohen A, DeVore R. Approximation of highdimensional parametric PDEs. Acta Numer. 2015;24:1–159.
Himpe C, Leibner T, Rave S. Hierarchical approximate proper orthogonal decomposition. SIAM J Sci Comput. 2018;40(5):3267–92.
Quarteroni A, Manzoni A, Negri F. Reduced basis methods for partial differential equations: an introduction, vol. 92. Berlin: Springer; 2015.
Chapman T, Avery P, Collins P, Farhat C. Accelerated mesh sampling for the hyper reduction of nonlinear computational models. Int J Numer Method Eng. 2017;109(12):1623–54.
Feng L, Lombardi L, Antonini G, Benner P. Multifidelity error estimation accelerates greedy model reduction of complex dynamical systems. Int J Numer Method Eng. 2023;124(3):5312–33.
PaulDuboisTaine A, Amsallem D. An adaptive and efficient greedy procedure for the optimal training of parametric reducedorder models. Int J Numer Method Eng. 2015;102(5):1262–92.
Barral N, Taddei T, Tifouti I. Registrationbased model reduction of parameterized PDEs with spatioparameter adaptivity. J Comput Phys. 2023;112727.
Benaceur A, Ehrlacher V, Ern A, Meunier S. A progressive reduced basis/empirical interpolation method for nonlinear parabolic problems. SIAM J Sci Comput. 2018;40(5):2930–55.
Barrault M, Maday Y, Nguyen NC, Patera AT. An empirical interpolation method: application to efficient reducedbasis discretization of partial differential equations. CR Math. 2004;339(9):667–72.
Conti P, Guo M, Manzoni A, Frangi A, Brunton SL, Kutz JN. Multifidelity reducedorder surrogate modeling. arXiv preprint arXiv:2309.00325 2023.
Yano M, Modisette J, Darmofal D. The importance of mesh adaptation for higherorder discretizations of aerodynamic flows. In: 20th AIAA Computational Fluid Dynamics Conference, p. 3852, 2011.
Lawson CL, Hanson RJ. Solving least squares problems. SIAM, 1995.
Yano M. Discontinuous Galerkin reduced basis empirical quadrature procedure for model reduction of parametrized nonlinear conservation laws. Adv Comput Math. 2019;45(5):2287–320.
Du E, Yano M. Efficient hyperreduction of highorder discontinuous Galerkin methods: elementwise and pointwise reduced quadrature formulations. J Comput Phys. 2022;466: 111399.
Urban K, Patera A. An improved error bound for reduced basis approximation of linear parabolic problems. Math Comput. 2014;83(288):1599–615.
Haasdonk B. Reduced basis methods for parametrized PDEs—a tutorial introduction for stationary and instationary problems. Model Reduct Approx Theory Algo. 2017;15:65.
Iollo A, Sambataro G, Taddei T. An adaptive projectionbased model reduction method for nonlinear mechanics with internal variables: application to thermohydromechanical systems. Int J Numer Method Eng. 2022;123(12):2894–918.
Brand M. Fast online SVD revisions for lightweight recommender systems. In: Proceedings of the 2003 SIAM International Conference on Data Mining, pp. 37–46, 2003, SIAM.
MATLAB: R2022a. The MathWorks Inc., Natick, Massachusetts. 2022.
Finite Element code_aster, Analysis of Structures and Thermomechanics for Studies and Research. Electricité de France (EDF), Open source on www.codeaster.org (1989–2024).
Taddei T. Compositional maps for registration in complex geometries. arXiv preprint arXiv:2308.15307 2023.
Agouzal E, Argaud JP, Bergmann M, Ferté G, Taddei T. Projectionbased model order reduction for prestressed concrete with an application to the standard section of a nuclear containment building. arXiv preprint arXiv:2401.05098 2024.
Bouhjiti D. Analyse probabiliste de la fissuration et du confinement des grands ouvrages en béton armé et précontraintapplication aux enceintes de confinement des réacteurs nucléaires (cas de la maquette vercors). Acad J Civil Eng. 2018;36(1):464–71.
Zahr MJ, Farhat C. Progressive construction of a parametric reducedorder model for PDEconstrained optimization. Int J Numer Method Eng. 2015;102(5):1111–35.
Alexandrov NM, Lewis RM, Gumbert CR, Green LL, Newman PA. Approximation and model management in aerodynamic optimization with variablefidelity models. J Aircraft. 2001;38(6):1093–101.
Yano M, Huang T, Zahr MJ. A globally convergent method to accelerate topology optimization using onthefly model reduction. Comput Methods Appl Mech Eng. 2021;375: 113635.
Mordicus Python Package. Consortium of the FUI Project MORDICUS. Electricité de France (EDF), open source on https://gitlab.com/mordicus/mordicus. 2022.
Acknowledgements
The authors thank Dr. Jean-Philippe Argaud, Dr. Guilhem Ferté (EDF R&D) and Dr. Michel Bergmann (Inria Bordeaux) for fruitful discussions. The first author thanks the code_aster development team for fruitful discussions on the FE software employed for the numerical simulations of section 4.2.
Funding
This work was partly funded by ANRT (French National Association for Research and Technology) and EDF.
Author information
Contributions
Eki Agouzal: methodology, software, investigation. Tommaso Taddei: conceptualization, methodology, software, investigation, writing.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Agouzal, E., Taddei, T. Accelerated construction of projection-based reduced-order models via incremental approaches. Adv. Model. and Simul. in Eng. Sci. 11, 8 (2024). https://doi.org/10.1186/s40323-024-00263-5