
Deep convolutional architectures for extrapolative forecasts in time-dependent flow problems

Abstract

Physical systems whose dynamics are governed by partial differential equations (PDEs) find numerous applications in science and engineering. Obtaining solutions from such PDEs may be computationally expensive for large-scale and parameterized problems. In this work, deep learning techniques developed especially for time-series forecasting, such as LSTM and TCN, or for spatial-feature extraction, such as CNN, are employed to model the system dynamics of advection-dominated problems. This paper proposes a Convolutional Autoencoder (CAE) model for compression and a CNN future-step predictor for forecasting. These models take as input a sequence of high-fidelity vector solutions at consecutive time steps obtained from the PDEs and forecast the solutions for the subsequent time steps using auto-regression, thereby reducing the computation time and power needed to obtain such high-fidelity solutions. Non-intrusive reduced-order modeling techniques, such as deep autoencoder networks, are utilized to compress the high-fidelity snapshots before feeding them as input to the forecasting models, in order to reduce the complexity and the required computations in the online and offline stages. The models are tested on numerical benchmarks (the 1D Burgers’ equation and Stoker’s dam-break problem) to assess the long-term prediction accuracy, even outside the training domain (i.e., extrapolation). The most accurate model is then used to simulate a hypothetical dam break in a river with complex 2D bathymetry. The proposed CNN future-step predictor yields much more accurate forecasts than the LSTM and TCN in the considered spatiotemporal problems.

Introduction

Efficient numerical simulations of complex dynamical systems are needed to seek solutions at different times or parameter instances, especially in fluid dynamics. These systems are typically described by a set of parameterized nonlinear partial differential equations (PDEs). Obtaining numerical solutions using a high-fidelity (finite element, finite volume, or finite difference type) computational solver may be extremely expensive, as the solver must create high-dimensional renderings of the solution to precisely resolve the spatio-temporal scales and inherent non-linearities. This approach thus becomes inefficient for applications such as optimization and uncertainty quantification, where numerous simulations are required. Reduced-order models (ROMs) are suitable substitutes for computationally expensive numerical solvers, as these methods generate a low-rank structure from the high-dimensional snapshots, which is then utilized to model the spatiotemporal dynamics of the PDE system. Among the various ROM techniques that have been developed, projection-based ROMs are employed most extensively. The method involves the generation of a reduced set of basis functions or modes such that their linear superposition effectively spans a low-rank approximation of the solutions. Proper Orthogonal Decomposition (POD) is the most popular method in the reduced-basis class. POD utilizes singular value decomposition (SVD) to generate an empirical basis of dominant orthonormal modes, yielding an optimal linear subspace in which to project the system-governing PDEs [1, 2]. Availability of the governing equations is necessary to employ intrusive ROM techniques such as the Galerkin projection [3] or the Petrov-Galerkin projection [4], which produce an interpretable ROM defined by high-energy or dominant modes. However, scenarios where the governing equations are unavailable require the application of data-driven methods, such as non-intrusive ROMs (NIROMs) [5, 6]. In a NIROM, the expansion coefficients for the reduced solution are obtained via interpolation on the reduced basis space spanned by the set of dominant modes. However, since the reduced dynamics generally belong to nonlinear manifolds, a variety of interpolation and regression methods have been proposed that are capable of enforcing the constraints characterizing those manifolds. Some of the methods most often employed are dynamic mode decomposition [7,8,9], radial basis function interpolation [10, 11], and Gaussian process regression [12, 13].

Recent advancements in machine learning (ML) methods [14] have given rise to approaches that effectively evaluate and expedite existing numerical models or solvers by using online-offline computational stages. In the offline stage, the ML model updates its weights or coefficients (training) to learn the system dynamics from the high-fidelity solutions obtained by the numerical solver, hence requiring computational power and time. In the online stage, the model uses the pre-computed/optimized weights from the training to obtain the solution (prediction) for a new set of input instances, and does so almost instantly with minimal computational cost. Various data-driven ML-based frameworks have been proposed to model the propagation of system dynamics in latent space. Some of the most successful examples involve the use of deep neural networks (DNNs) [15], long short-term memory (LSTM) networks [16,17,18,19], neural ordinary differential equations (NODE) [19,20,21], and temporal convolutional networks (TCNs) [22, 23].

Significant work has been carried out recently on predicting solution instances outside the training domain for a variety of fluid problems with discontinuities, wave propagation, and advection-dominated flows. Liu et al. [24] presented a predictive data assimilation framework based on the Ensemble Kalman Filter (EnKF) and the DDROM model, which uses an autoencoder network to compress the high-dimensional dynamics to a lower-dimensional space and then an LSTM network to model the fluid dynamics in the latent space. The model capabilities were assessed on the 2D Burgers’ equation and a flow-past-a-cylinder test case. Maulik et al. [25] proposed a Convolutional Autoencoder (CAE) for compression and a recurrent LSTM network for the time evolution in the reduced space. The CAE-LSTM model was capable of reconstructing the sharp profile of the advecting Burgers’ solution more accurately than the POD-Galerkin technique. Dutta et al. [18] utilized an advection-aware (AA) autoencoder network that learns nonlinear embeddings of the high-fidelity system snapshots using an arbitrary snapshot from the dataset, and then models the latent-space dynamics with an LSTM network to make predictions for the linear advection and Burgers’ problems. Cheng et al. [26] used the POD-ANN model, in which they performed a priori dimension reduction on the high-fidelity dataset and parameterization with an artificial neural network (ANN) to solve the strongly non-linear Allen-Cahn equations and the cylinder flow problem. Heaney et al. [27] proposed an AI-DDNIROM framework, capable of making predictions for spatial domains significantly larger than the training domain, using a domain decomposition approach, an autoencoder network for low-rank representation, and an adversarial network for making the predictions for flow past a cylinder and slug flow problems. Fatone et al. [19] introduced a \(\mu t\)-POD-LSTM ROM framework that is capable of extrapolating over time windows of around 15% of the training domain for the unsteady advection-diffusion and unsteady Navier-Stokes equations at new parameter instances. Xu et al. [23] proposed a multi-level framework comprising a convolutional autoencoder (CAE), a temporal CAE (TCAE), and a multilayer perceptron (MLP) for parameterization, and a TCN network for auto-regressive future-state predictions, and evaluated the results on problems such as Sod’s shock tube and transient ship waves. Wu et al. [22] developed a POD and TCN-based neural network for making predictions on the viscous periodic flow past a cylinder. Abdedou et al. [28] proposed two CAE architectures to compress, in space and time, the high-dimensional snapshot matrices obtained from numerical solvers for the Burgers’, Stoker’s, and shallow-water equations, and performed parameterization on the compressed latent space. Jacquier et al. [29] employed uncertainty quantification methods (Deep Ensembles and Variational Inference-based Bayesian Neural Networks) with the POD-ANN order-reduction method to perform predictions within and outside of the training domain on problems such as the shallow water equations for flood prediction, and generated probabilistic flooding maps aware of model uncertainty. Geneva et al. [30] presented a physics-constrained Bayesian auto-regressive CAE network that models non-linear dynamical systems (the Kuramoto-Sivashinsky equation, 1D Burgers’, and 2D Burgers’) without training data, using only the initial conditions.
This reduces the computation cost tremendously and provides uncertainty quantification at each time step.

The caveat that remains is long-term temporal extrapolation for fluid problems marked by sharp gradients and discontinuities. Our study explores forecasting architectures (LSTM, TCN, and CNN) to obtain accurate solutions for time steps distant from the training domain on advection-dominated test cases. The high-dimensional input snapshot matrix is first compressed in space to obtain the reduced latent vectors before they are passed as a sequence to the forecasting models. Two types of architectures are first evaluated for spatial compression, an MLP autoencoder and a convolutional autoencoder (CAE), to identify the one that is more accurate in terms of reconstruction and preservation of the input information. A simple convolutional architecture is then proposed and shown to provide accurate forecasts.

The subsequent sections of the paper are organized as follows. The “Methodology” section describes the dataset structure along with the training and testing strategies, followed by a presentation of the autoencoders for space compression and the forecasting architectures. In the “Results and discussion” section, the models are tested on three numerical cases representative of advection-dominated flows: the one-dimensional Burgers’ problem, the one-dimensional Stoker’s problem, and the two-dimensional shallow-water equations used to model a dam-break scenario on a real river. Finally, the “Conclusion” section presents a summary of the results obtained by the models and some concluding remarks.

Methodology

Dataset

The dataset comprises T solution vectors (snapshots) \(v^{i}\) with \(n_s\) nodes (\(v^i \, \in \, {\mathbb {R}}^{n_s}\)) at time-steps i \(\in \) \(\{1, 2,..., T\}\), obtained using a high-fidelity PDE solver. For the autoencoder models, the output is the reconstruction of the input; therefore, the training and validation inputs and outputs are the snapshot vectors \(v^{i}\). For the forecasting models (Fig. 1), N samples are used for training; in each sample, the input is a sequence of \(n_{t}\) snapshots (lookback window = \(n_t\)): \(V = [v^{i-n_{t}+1},..., v^{i-1}, v^{i}]\), with \(V \, \in \, {\mathbb {R}}^{n_s \times n_t}\), and the corresponding output is the vector at the time-step immediately following the sequence, \(v^{i+1}\, \in \, {\mathbb {R}}^{n_s}\).
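As an illustration of this sample construction, a minimal NumPy sketch is given below; the function name and the array layout are illustrative choices (not taken from the paper’s code), under the assumption that the snapshots are stored row-wise in a single array.

```python
import numpy as np

def build_samples(snapshots, n_t):
    """Assemble (input sequence, target) pairs from consecutive snapshots.

    snapshots : array of shape (T, n_s), one solution vector per time step
    n_t       : lookback window length
    Returns X of shape (N, n_s, n_t) and y of shape (N, n_s), with N = T - n_t.
    """
    T, n_s = snapshots.shape
    X, y = [], []
    for i in range(n_t, T):
        X.append(snapshots[i - n_t:i].T)  # V = [v^{i-n_t+1}, ..., v^i], shape (n_s, n_t)
        y.append(snapshots[i])            # target: the next snapshot v^{i+1}
    return np.stack(X), np.stack(y)
```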

Fig. 1
figure 1

Training and validation method

For extrapolative testing (Fig. 2), a sequence of \(n_{t}\) vectors from the start of the dataset, \(V = [v^{1},...,v^{n_t-1}, v^{n_{t}}]\, \in \, {\mathbb {R}}^{n_s \times n_t}\), is fed to the model to produce the vectors at all the subsequent time-steps, \([v^{n_{t}+1}, v^{n_{t}+2},..., v^{T}]\, \in \, {\mathbb {R}}^{n_s \times (T-n_t)}\), in an auto-regressive manner; i.e., first only a single subsequent snapshot \(v^{n_{t}+1}\) is predicted, which is then concatenated with the previous \(n_t-1\) vectors and passed to the forecasting model to produce the vector \(v^{n_{t}+2}\). This process is repeated until the desired number of subsequent solution vectors is obtained.
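A minimal sketch of this auto-regressive rollout is shown below, assuming a generic `model` callable that maps a lookback window to the next snapshot; the names are hypothetical.

```python
import numpy as np

def autoregressive_forecast(model, seed_seq, n_steps):
    """Roll the forecaster forward auto-regressively.

    model    : callable mapping an (n_s, n_t) sequence to the next (n_s,) vector
    seed_seq : array of shape (n_s, n_t), the first n_t snapshots of the dataset
    n_steps  : number of future time steps to generate (e.g. T - n_t)
    """
    window = seed_seq.copy()
    preds = []
    for _ in range(n_steps):
        v_next = model(window)                                              # predict the next snapshot
        preds.append(v_next)
        window = np.concatenate([window[:, 1:], v_next[:, None]], axis=1)   # slide the lookback window
    return np.stack(preds, axis=1)                                          # shape (n_s, n_steps)
```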

Fig. 2
figure 2

Autoregressive testing method for forecasting models

Non-intrusive reduced-order modeling

Non-intrusive ROMs (NIROMs) bypass the governing equations and utilize the full-order model solutions to develop a data-driven model, which compresses the full-order data (snapshot) into a reduced-order (latent) space. The method most widely adopted to perform this utilizes deep neural network architectures called autoencoders [31].

An autoencoder learns an approximation of the identity mapping, \(\chi \): \(v^i \rightarrow v_{ae}^i\) such that \(v^i \approx v_{ae}^i\) and \(\chi \): \({\mathbb {R}}^{n_s} \rightarrow {\mathbb {R}}^{n_s}\), where \({n_s}\) is the number of nodes in the solution vector \(v^i\). This is accomplished using a two-part architecture. The first part of the autoencoder network is the encoder \(\chi _e\), which maps a high-dimensional input vector \(v^i\) to a low-dimensional latent vector \(z^i\): \(z^i = \chi _e (v^i; \theta _e )\), with \(z^i \in {\mathbb {R}}^m\) \((m \ll n_s)\). The second part is called the decoder, \(\chi _d\), which maps the latent vector \(z^i\) to an approximation \(v_{ae}^i\) of the high-dimensional input vector \(v^i\): \(v_{ae}^i = \chi _d (z^i; \theta _d)\). The combination of these two parts yields an autoencoder network (Fig. 3) of the form \(\chi \): \(v^i \rightarrow \chi _d \circ \chi _e(v^i)\). The autoencoder model is trained by computing optimal values of the parameters (\(\theta _e\), \(\theta _d\)) that minimize the reconstruction error over all the training data [18]:

$$\begin{aligned} \theta _e, \theta _d = \mathop {\mathrm {arg\,min}}\limits _{\theta _e, \theta _d} {\mathcal {L}}(v^i, v_{ae}^i) \end{aligned}$$
(1)

where \({\mathcal {L}}(v^i, v_{ae}^i)\) is a chosen measure of discrepancy between \(v^i\) and its approximation \(v_{ae}^i\). The restriction (dim(\(z^i\)) = m) \(\ll \) (\(n_s\) = dim(\(v^i\))) forces the autoencoder model to learn the salient features of the input data via compression into a low-dimensional space and to then reconstruct the input, instead of directly learning the identity function. Autoencoder architectures are generally built from multilayer perceptrons (MLPs) [18], convolutional neural networks (CAEs) [23, 25, 28], or a combination of both. While small-sized problems can be effectively modeled via an MLP architecture, problems involving data of high spatial complexity require CAEs for effective and accelerated spatial compression. The architecture of an MLP autoencoder, with two fully connected dense layers (hidden layers) in the encoder network and a mirrored decoder network, is shown in Fig. 4. The convolutional autoencoder consists of two convolution layers, each followed by batch normalization, swish activation, and an average pooling layer, as described in Fig. 5.
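The sketch below outlines a 1D convolutional autoencoder of this kind in PyTorch; the framework, the channel counts, and the upsampling-based decoder are assumptions for illustration and do not reproduce the exact architecture of Fig. 5.

```python
import torch.nn as nn

class ConvAutoencoder1D(nn.Module):
    """Two Conv1d stages (BatchNorm + SiLU/"swish" + AvgPool) in the encoder and a
    mirrored decoder built with upsampling; channel counts are illustrative."""

    def __init__(self, channels=(8, 32), kernel=3):
        super().__init__()
        pad = kernel // 2
        c1, c2 = channels
        self.encoder = nn.Sequential(
            nn.Conv1d(1, c1, kernel, padding=pad), nn.BatchNorm1d(c1), nn.SiLU(), nn.AvgPool1d(2),
            nn.Conv1d(c1, c2, kernel, padding=pad), nn.BatchNorm1d(c2), nn.SiLU(), nn.AvgPool1d(2),
        )
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2), nn.Conv1d(c2, c1, kernel, padding=pad), nn.BatchNorm1d(c1), nn.SiLU(),
            nn.Upsample(scale_factor=2), nn.Conv1d(c1, 1, kernel, padding=pad),
        )

    def forward(self, v):            # v: (batch, 1, n_s)
        z = self.encoder(v)          # latent representation, spatial size n_s / 4
        return self.decoder(z), z
```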

Fig. 3
figure 3

Autoencoder architecture

Fig. 4
figure 4

MLP autoencoder architecture

Fig. 5
figure 5

Convolutional autoencoder architecture

Forecasting techniques

After compression by the encoder (\(\chi _e\)), the dataset (“Dataset” section) yields N samples of the form \(Z=[z^{i-n_{t}+1},..., z^{i-1}, z^{i}] \, \in \, {\mathbb {R}}^{m \times n_t}\) with targets \(z^{i+1} \in \, {\mathbb {R}}^m\), which are used to train the following forecasting models.

Long short-term memory (LSTM)

LSTM [32] is a special type of recurrent neural network (RNN) that is well-suited for performing regression tasks on time-series data. The main difference between the traditional RNN and the LSTM architecture is the capability of an LSTM memory cell to retain information over time and an internal gating mechanism that regulates the flow of information into and out of the memory cell [33]. The LSTM cell consists of three parts, also known as gates, each with a specific function. The first part, called the forget gate, chooses whether the information from the previous step in the sequence is to be remembered or can be forgotten. The second part, called the input gate, tries to learn new information from the current input to the cell. The third and final part, called the output gate, passes the updated information from the current step to the next step in the sequence. The basic LSTM equations for an input vector \(v^i\) are:

$$\begin{aligned} input\,gate: \zeta _{in} = \alpha _s \circ F_{in}(v^i) \end{aligned}$$
(2)
$$\begin{aligned} forget\,gate: \zeta _{for} = \alpha _s \circ F_{for}(v^i) \end{aligned}$$
(3)
$$\begin{aligned} cell\,state: c_i = \zeta _{for} \odot c_{i-1} + \zeta _{in}\odot (\alpha _t \circ F_a(v^i)) \end{aligned}$$
(4)
$$\begin{aligned} output\,gate: \zeta _{out} = \alpha _s \circ F_{out}(v^i) \end{aligned}$$
(5)
$$\begin{aligned} output: h_i = \zeta _{out}\odot \alpha _t(c_i) \end{aligned}$$
(6)

Here, F refers to a linear transformation defined by a matrix multiplication and bias addition, that is, \( F(v^i) = W v^i + b\), where W \(\in \) \({\mathbb {R}}^{h \times n_s}\) is a matrix of layer weights (h is the number of neurons in the LSTM cell), b \(\in \) \({\mathbb {R}}^{h}\) is a vector of bias values, and \(v^i\) \(\in \) \({\mathbb {R}}^{n_s}\) is the input vector to the LSTM cell. Also, \(\alpha _s\) and \(\alpha _t\) denote the sigmoid and hyperbolic tangent activation functions, respectively, which are standard choices in an LSTM network, and \(x \odot y\) denotes the Hadamard product of two vectors x and y. The sequence of snapshot vectors over \(n_{t}\) time-steps, \(V = [v^{i-n_{t}+1},..., v^{i-1}, v^{i}]\) with \(V \, \in \, {\mathbb {R}}^{n_s \times n_t}\), trains the LSTM network, with recurrence over time (Fig. 6), to predict the subsequent vector \(v^{i+1}\). The core concept of an LSTM network is the cell state \(c_i\), which behaves as the “memory” of the network. It can either allow greater preservation of past information, reducing the issues of short-term memory, or it can suppress the influence of the past, depending on the actions of the various gates during the training process.
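A compact PyTorch sketch of an LSTM future-step predictor operating on compressed sequences is shown below; the linear output head and the layer sizes are assumptions for illustration.

```python
import torch.nn as nn

class LSTMForecaster(nn.Module):
    """Processes the latent sequence Z = [z^{i-n_t+1}, ..., z^i] with recurrence
    over time and maps the last hidden state to the next latent vector z^{i+1}."""

    def __init__(self, latent_dim=50, hidden_dim=50, num_layers=1):
        super().__init__()
        self.lstm = nn.LSTM(latent_dim, hidden_dim, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_dim, latent_dim)   # hidden state -> next latent vector

    def forward(self, Z):                # Z: (batch, n_t, latent_dim)
        out, _ = self.lstm(Z)            # out: (batch, n_t, hidden_dim)
        return self.head(out[:, -1])     # z^{i+1}: (batch, latent_dim)
```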

Fig. 6
figure 6

LSTM recurrence on vector sequence

Temporal Convolution Network (TCN)

The TCN is based on two principles [34]: the network produces an output of the same length as the input, and there can be no leakage from the future into the past. To ensure that the first principle is respected, the TCN uses a 1D fully-convolutional network (FCN) where each hidden layer has the same length as the input layer, and zero padding of length \((k -1)\) is added to keep subsequent layers the same length as previous ones. To respect the second principle, the TCN uses causal convolutions (achieved by padding only on the starting side of input sequences), where the output at time i is convolved only with elements from time i and earlier in the previous layer (Fig. 7). A TCN also makes use of dilated convolutions that enable an exponentially large receptive field. For an input sequence \(V \ \in \) \({\mathbb {R}}^{n_s \times n_t}\) and a kernel K with learnable weights, \(K \ \in \ {\mathbb {R}}^k\) (k is the kernel size), the element O(s) with s \(\in \{0, 1,..., n_t-k+1\}\) produced by the dilated 1D convolution is:

$$\begin{aligned} O(s) = \sum _{j=0}^{k-1}V(s+j\cdot d) \times K(j) \end{aligned}$$
(7)

where d is the dilation factor and k is the kernel size. When using dilated convolutions, d is increased exponentially with the depth of the network (e.g., \(d = 2^l\) at level \(l\) of the network), ensuring that some filter hits each input within a large effective history.

Fig. 7
figure 7

a 1D dilated filters convolving on the temporal dimension of vectors, b 1D CNN filters convolving on the spatial dimension of vectors

In the TCN model employed here, a generic residual block is used in place of a convolutional layer. A residual block contains a branch leading to a series of transformations F, obtained by dilated causal convolution layers, whose output is added to the input V of the block to obtain \(O_{rb}\):

$$\begin{aligned} O_{rb} = Activation(V + F(V)) \end{aligned}$$
(8)

Within a residual block (Fig. 8), the TCN has two layers of dilated causal convolution, each with weight normalization and a leaky rectified linear unit (leaky ReLU) non-linearity. To account for different input-output widths during the addition operation, a 1D convolution (kernel size = 1 and channels = \(n_s\)) is used to ensure that the element-wise addition operator (\(\oplus \)) receives tensors of the same shape.
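A minimal PyTorch sketch of one such residual block with dilated causal convolutions is given below; the framework and the exact layer hyperparameters are assumptions for illustration.

```python
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils import weight_norm

class TemporalBlock(nn.Module):
    """Two dilated causal convolutions with weight normalization and leaky ReLU,
    plus a kernel-size-1 convolution on the skip branch when channel counts differ."""

    def __init__(self, c_in, c_out, kernel=3, dilation=1):
        super().__init__()
        self.pad = (kernel - 1) * dilation   # causal: pad only on the left side
        self.conv1 = weight_norm(nn.Conv1d(c_in, c_out, kernel, dilation=dilation))
        self.conv2 = weight_norm(nn.Conv1d(c_out, c_out, kernel, dilation=dilation))
        self.act = nn.LeakyReLU()
        self.skip = nn.Conv1d(c_in, c_out, 1) if c_in != c_out else nn.Identity()

    def _causal(self, x, conv):
        return conv(F.pad(x, (self.pad, 0)))  # left padding keeps the convolution causal

    def forward(self, x):                     # x: (batch, c_in, n_t)
        y = self.act(self._causal(x, self.conv1))
        y = self.act(self._causal(y, self.conv2))
        return self.act(y + self.skip(x))     # residual connection, Eq. (8)
```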

Fig. 8
figure 8

Residual Block (left) and architecture of CNN and TCN (right). For TCN, the input vector \(V \ \in \) \({\mathbb {R}}^{n_s \times n_t}\) and \(c = n_{s}\). For CNN, the input vector \(V \ \in \) \({\mathbb {R}}^{n_t \times n_s}\) and \(c = 1\)

When convolving along the temporal axis, this (standard) TCN model uses information available from all the prior time steps (due to the large receptive field) to evaluate the next time step, as sketched in Fig. 7. The model takes in a sequence of \(n_{t}\) vectors corresponding to a look-back window of size \(n_{t}\): \(V = [v^{i-n_{t}+1},..., v^{i-1}, v^{i}]\), with \(V \, \in \, {\mathbb {R}}^{n_s \times n_t}\). The filters convolve along the temporal axis for all the \(n_s\) vector nodes, since the nodes are passed in as channels. However, as shown in the “Results and discussion” section, the solutions produced by this model do not propagate beyond the training domain. Therefore, another model is proposed here, where the dilated convolutions of the TCN model convolve along the spatial axis and thus use the information available from the neighboring nodes to determine the future time-step value of each node. This model takes in a sequence of \(n_{t}\) vectors corresponding to a look-back window of size \(n_{t}\) in a transposed manner, such that the \(n_{t}\) solution vectors are on separate channels: \(V^T \, \in \, {\mathbb {R}}^{n_t \times n_s}\), where \(V = [v^{i-n_{t}+1},..., v^{i-1}, v^{i}]\). This model produces significantly better results than the TCN on the temporal axis, but the causal padding and dilations employed are of no significance when the convolution filter operates along the spatial axis. Another architecture for modeling the system dynamics, with 1D convolutions and without any dilations or causal padding, is therefore proposed in the following section.

A proposed Convolution Neural Network (CNN) for time forecasting

A convolutional layer convolves filters with trainable weights on the input vector \(v^i\) [31]. Such filters are commonly referred to as convolutional kernels. In a convolutional neural network, the inputs and outputs can have multiple channels. For a convolutional layer with \(n_{in}\) input channels and \(n_{out}\) output channels, the total number of convolutional kernels is \(n_{k} = n_{in} \times n_{out}\). Each kernel slides along the spatial direction, and the products of kernel weights and vector nodes are computed at all sliding steps. For an input vector \(v^i\) and a kernel K, the corresponding output feature map O(s) with s \(\in \{0, 1,..., n_s-k+1\}\) (where k is the kernel size) is given by:

$$\begin{aligned} O(s) = \sum _{j=0}^{k-1}v^i(s+j) \times K(j) \end{aligned}$$
(9)

Zero padding of size \((k-1)/2\) is added to both sides of the input so that the output feature map retains the spatial dimension \(n_s\).

The proposed CNN forecasting model takes in a sequence of vectors over \(n_{t}\) time-steps in a transposed manner as its input, \(V^T \, \in \, {\mathbb {R}}^{n_t \times n_s}\) where \(V = [v^{i-n_{t}+1},..., v^{i-1}, v^{i}]\), so that the filters convolve on the spatial dimension of size \(n_s\) and the \(n_{t}\) vectors lie on separate channels, as shown in Fig. 7. The CNN architecture (Fig. 8) consists of X residual blocks (X is a hyperparameter); within each block, the input, after a channel-matching transformation by a 1D convolution layer (kernel size = 1 and channels = 1), is added to the output of the block. A residual block consists of two convolution layers, each followed by a weight normalization and a leaky ReLU activation layer.
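The sketch below illustrates this forecaster in PyTorch; the channel-handling details (a width hyperparameter and a final kernel-size-1 convolution that collapses the output to a single channel) are assumptions for illustration, not the exact configuration of Fig. 8.

```python
import torch.nn as nn
from torch.nn.utils import weight_norm

class ResBlock1D(nn.Module):
    """Two 1D convolutions over the spatial axis with weight normalization and
    leaky ReLU; a kernel-size-1 convolution matches channels on the skip branch."""

    def __init__(self, c_in, c_out, kernel=3):
        super().__init__()
        pad = (kernel - 1) // 2                  # symmetric "same" padding, no causality
        self.conv1 = weight_norm(nn.Conv1d(c_in, c_out, kernel, padding=pad))
        self.conv2 = weight_norm(nn.Conv1d(c_out, c_out, kernel, padding=pad))
        self.act = nn.LeakyReLU()
        self.skip = nn.Conv1d(c_in, c_out, 1) if c_in != c_out else nn.Identity()

    def forward(self, x):                        # x: (batch, channels, spatial dim)
        y = self.act(self.conv1(x))
        y = self.act(self.conv2(y))
        return self.act(y + self.skip(x))

class CNNForecaster(nn.Module):
    """The n_t lookback vectors enter as channels, the filters convolve along the
    spatial axis, and a final 1x1 convolution yields the single-channel prediction."""

    def __init__(self, n_t=10, width=50, n_blocks=2, kernel=3):
        super().__init__()
        blocks = [ResBlock1D(n_t, width, kernel)]
        blocks += [ResBlock1D(width, width, kernel) for _ in range(n_blocks - 1)]
        blocks += [nn.Conv1d(width, 1, 1)]       # collapse channels to one output vector
        self.net = nn.Sequential(*blocks)

    def forward(self, V_T):                      # V_T: (batch, n_t, n_s or m), transposed sequence
        return self.net(V_T).squeeze(1)          # predicted next vector: (batch, n_s or m)
```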

Metrics

To evaluate the performance of the previous architectures, the following metrics are used:

Mean Squared Error (\(L_2\) norm): The average of the square of the difference between the actual \(v_i\) and predicted values \({\hat{v}}_i\) over N samples:

$$\begin{aligned} MSE = \frac{\sum _{i=1}^N(v_{i}-{\hat{v}}_{i})^2}{N} \end{aligned}$$
(10)

Mean Absolute Error (\(L_1\) norm): The average of the absolute difference between the actual \(v_i\) and predicted values \({\hat{v}}_i\) over N samples:

$$\begin{aligned} MAE = \frac{\sum _{i=1}^N \Vert v_{i}-{\hat{v}}_{i}\Vert }{N} \end{aligned}$$
(11)

Relative \(L_2\) Norm Error: The relative \(L_2\) norm error (referred to simply as the error) is calculated as:

$$\begin{aligned} \text {Relative Error} = \frac{\sqrt{\sum _{i=1}^N (v_{i}-{\hat{v}}_{i})^2}}{\sqrt{\sum _{i=1}^N v_{i}^2}} \end{aligned}$$
(12)
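For reference, these three metrics can be computed with a few lines of NumPy, as in the sketch below (function names are illustrative).

```python
import numpy as np

def mse(v, v_hat):
    """Mean squared error, Eq. (10)."""
    return np.mean((v - v_hat) ** 2)

def mae(v, v_hat):
    """Mean absolute error, Eq. (11)."""
    return np.mean(np.abs(v - v_hat))

def relative_l2_error(v, v_hat):
    """Relative L2 norm error, Eq. (12)."""
    return np.linalg.norm(v - v_hat) / np.linalg.norm(v)
```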

Results and discussion

The capability of the autoencoders (MLP-AE and CAE) to efficiently transform high-dimensional vectors to a low-dimensional space, and that of the forecasting models (LSTM, TCN, and CNN) to accurately model the system dynamics were tested using advection-dominated flow problems.

1D Burgers’ problem

The test case involves the one-dimensional Burgers’ equation, a non-linear advection-diffusion PDE. The equation, along with the initial and Dirichlet boundary conditions, is given by

$$\begin{aligned} \frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} = \nu \frac{\partial ^2 u}{\partial x^2} \end{aligned}$$
(13)
$$\begin{aligned} x \in [0, L], u(0, t) = 0 \end{aligned}$$
(14)
$$\begin{aligned} u(x, 0) \equiv u_0 = \frac{x}{1+\sqrt{\frac{1}{t_0}}\exp ({Re\frac{x^2}{4}})} \end{aligned}$$
(15)

where the length \(L = 1\,m\) and the maximum time \(T_{max} = 2\,s\). The solutions obtained from the above equations produce sharp gradients even with smooth initial conditions if the viscosity \(\nu \) is sufficiently small, due to the advection-dominated behaviour. The analytical solution to the problem is given by:

$$\begin{aligned} u(x, t) = \frac{\frac{x}{t+1}}{1+\sqrt{\frac{t+1}{t_0}}\exp ({Re\frac{x^2}{4t+4}})} \end{aligned}$$
(16)

where \(t_0 = \exp ({\frac{Re}{8}}) \) and \(Re = 1/\nu \). The high-fidelity solution vectors are generated by directly evaluating the analytical solution over a uniformly discretized spatial domain containing 200 grid points (\(n_s = 200\)) at 250 uniform time-steps (\(T = 250\)) for two different values of Re: 300 and 600. The solution vectors obtained are then used to train the autoencoder and forecasting models, as described in the “Dataset” section. For the autoencoder training, 200 solution vectors are chosen at random time steps, and the remaining 50 are used for validation. For the forecasting model, the training set comprises the first 150 compressed samples, each sample containing \(n_t\) consecutive solution vectors (i.e., lookback window = \(n_t\)), where \(n_t\) is a hyperparameter. The validation set consists of the subsequent 10 samples. For testing, \(n_t\) latent vectors from the start of the dataset are fed to the forecasting model to predict the subsequent time steps via auto-regression (Table 1).
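A minimal sketch of this snapshot generation, obtained by evaluating Eq. (16) on a uniform space-time grid, is shown below; the sampling of the interval \([0, T_{max}]\) starting at t = 0 is an assumption for illustration.

```python
import numpy as np

def burgers_snapshots(Re=300, n_s=200, T=250, L=1.0, t_max=2.0):
    """Evaluate the analytical Burgers' solution (Eq. 16) on a uniform grid
    and return a snapshot matrix of shape (T, n_s), one row per time step."""
    x = np.linspace(0.0, L, n_s)
    t = np.linspace(0.0, t_max, T)
    t0 = np.exp(Re / 8.0)
    X, Tm = np.meshgrid(x, t)                    # both of shape (T, n_s)
    u = (X / (Tm + 1.0)) / (
        1.0 + np.sqrt((Tm + 1.0) / t0) * np.exp(Re * X**2 / (4.0 * Tm + 4.0))
    )
    return u

snapshots = burgers_snapshots(Re=300)            # rows are v^1, ..., v^T
```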

Table 1 Burgers’ problem: training, validation and testing dataset

Autoencoders for spatial compression

Two types of autoencoder architectures (“Non-intrusive reduced-order modeling” section), MLP (referred to as AE) and convolutional (referred to as CAE), are proposed for the compression of the solution vectors \(v^i \, \in {\mathbb {R}}^{n_s}\) to latent vectors \(z^i \, \in {\mathbb {R}}^{m}\) by the encoder \(\chi _e\). Sequences formed from these latent vectors (\([z^{i-n_{t}+1},..., z^{i-1}, z^{i}] \, \in \, {\mathbb {R}}^{m \times n_t}\)) are utilized to train the forecasting models: LSTM, TCN and CNN. The trained models are then used to forecast the latent vectors at the time steps subsequent to the input sequence (\([z^{n_{t}+1}, z^{n_{t}+2},..., z^{T}] \, \in \, {\mathbb {R}}^{m \times (T-n_t)}\)). The latent vectors are then reconstructed into solution vectors (\([v^{n_{t}+1}, v^{n_{t}+2},..., v^{T}] \, \in \, {\mathbb {R}}^{n_s \times (T-n_t)}\)) using the decoder \(\chi _d\). The heat map plots obtained by stacking these reconstructed solution vectors along the x-axis (spatial nodes \(n_s\) along y, time-steps along x) are illustrated for both autoencoder models in Appendix Table 14. Separate compression and forecasting models were used for the two cases, Re = 300 and Re = 600.

Both architectures, AE and CAE, are capable of efficiently compressing the solution vectors to latent vectors with few modes and fine reconstruction/decompression. However, only CAE compression followed by CNN auto-regression produces accurate results on extrapolation. This is because the proposed CAE architecture is devoid of any dense (fully connected) layer, and therefore the local spatial information in the vector remains preserved even during compression. This consistency facilitates the modeling of the latent dynamics by the CNN model, as it convolves on the spatial axis of the input and utilizes information from the neighboring cells at the provided time steps to predict nodal values at subsequent time steps.

The hyperparameters for the AE and CAE architectures are listed in Table 2, where encoder layers denote the number of neurons in the two dense encoder layers of AE, and the number of channels in the convolution layers of CAE. The decoders of both autoencoders are mirrored structures of their encoders.

Table 2 Burgers’ problem: hyperparameters for the AE and CAE networks

In the subsequent sections, the results of the forecasting models are produced using compression via the CAE, whose encoder has two layers with 8 and 32 channels, respectively, and a latent dimension of 50. The kernel size of the 1D convolutions is 3, and each layer uses padding of size 1. All the hidden layers use the swish activation.

LSTM model

When the LSTM (“Long short-term memory (LSTM)” section) is used as the future-step predictor, it takes in a sequence of \(n_t\) (lookback window) latent vectors (spatial dimension = m) obtained by compression from the encoder network (\(Z \, \in \, {\mathbb {R}}^{m \times n_t}\)) to produce the latent vector for the next time-step (\(z^{i+1}\, \in \, {\mathbb {R}}^{m} \)). The LSTM model consists of multiple LSTM layers stacked together, each having a hidden dimension equal to the latent dimension of the solution vectors. The various sets of hyperparameters considered for the LSTM network, for both Re = 300 and Re = 600, are summarized in Table 3.

Table 3 Burgers’ problem: hyperparameters for the LSTM network

The models are trained in batches of size 15, and the loss values for both training and validation converge in 3000 epochs. The model with the least validation loss has a lookback window of size 10 and a single LSTM layer with hidden dimension 50 (latent dimension) for both Re = 300 and Re = 600. The extrapolation (Fig. 9) and error plots (Fig. 10) obtained from these models show that the LSTM model accurately predicts the solution vectors for time-steps within the training domain (\(i \le 150\)), but the solution does not change for time-steps outside the training domain, and so the relative error increases drastically, reaching 35% for Re = 300 and 50% for Re = 600 at time-step 200.

Fig. 9
figure 9

Burgers’ problem: Extrapolative auto-regressive predictions by the LSTM model for time-steps = 180, 200, and 220. The training end time-step = 160; for Re = 300 (a) and Re = 600 (b)

Fig. 10
figure 10

Burgers’ problem: \(L_2\) relative error of the autoregressive predictions with increasing time for Re = 300 (a) and Re = 600 (b) for the LSTM model

TCN model

Similar to the LSTM, the TCN model (“Temporal Convolution Network (TCN)” section) takes in a sequence of \(n_t\) latent vectors obtained by compression from the encoder network (\(Z \, \in \, {\mathbb {R}}^{m \times n_t}\)) to produce the latent vector for the next time-step (\(z^{i+1}\, \in \, {\mathbb {R}}^{m} \)), with the dilated convolutions operating on the temporal axis and the latent dimension passed as channels. The TCN model consists of multiple TCN blocks, each with the same kernel size and number of channels, but with dilations increasing by a factor of 2 in subsequent blocks. The hyperparameters for the TCN network for both Re = 300 and Re = 600 are summarized in Table 4.

Table 4 Burgers’ problem: hyperparameters for the TCN network

When the models are trained in batches of size 15, the training and validation losses reach their minimum values in 4000 epochs. The model with the least validation loss takes in a sequence with a lookback window of 10 and a latent dimension of 50. For Re = 300, the best model has 3 temporal blocks, each having 64 channels, whereas for Re = 600, it has 2 TCN blocks with kernel size 3 and 64 channels each. The extrapolation (Fig. 11) and error plots (Fig. 12) obtained from these models indicate that the TCN model accurately predicts the solution vectors for time steps within the training domain (\(i \le 150\)), but stops being accurate after the end of training, so that the error increases to 40% for Re = 300 and 50% for Re = 600. However, if the same model architecture operates on the input sequence such that the dilated 1D convolutions propagate along the spatial axis, with each solution vector on a separate channel, then accurate forecasts are produced even outside the training domain. This encourages the development of a simpler predictive/forecasting model, devoid of dilations and causal padding, since the exponentially increasing receptive field serves no purpose when operating along the spatial axis.

Fig. 11
figure 11

Burgers’ problem: Extrapolative auto-regressive predictions using the TCN model (over time) for Re = 300 (a) and Re = 600 (b) and for time-steps = 180, 200 and 220; the training end time-step = 160

Fig. 12
figure 12

Burgers’ problem, \(L_2\) relative error of the auto-regressive predictions with increasing time for Re = 300 (a) and Re = 600 (b) for the TCN model (over time)

CNN model

The proposed CNN model (“A proposed Convolution Neural Network (CNN) for time forecasting” section) takes in a sequence of \(n_t\) latent vectors obtained by compression from the encoder network (\(Z^T \, \in \, {\mathbb {R}}^{n_t \times m}\)) to produce the latent vector for the next time-step (\(z^{i+1}\, \in \, {\mathbb {R}}^{m} \)), with 1D convolutions operating on the spatial axis (latent dimension) and the \(n_t\) latent vectors on separate channels. The CNN model consists of two residual blocks and operates on latent vectors of spatial dimension 50. Each block possesses the same kernel size and number of channels. The hyperparameters for the CNN network for both Re = 300 and Re = 600 are summarized in Table 5.

Table 5 Burgers’ problem: hyperparameters for the CNN network

The model with the least validation loss has a lookback window of size 10, and each of its blocks has a kernel size of 3 and 50 channels, for both Re = 300 and Re = 600. The training and validation losses converge in 3000 epochs for a batch size of 15. It is clear from the extrapolation (Fig. 13) and error plots (Fig. 14) that the CNN model accurately models the latent dynamics and predicts solution vectors for time steps beyond the training domain. The error values increase with time, due to the accumulation of errors, since each subsequent time step is predicted auto-regressively, i.e., using previously predicted time steps that contain slight errors. Still, the error reaches a mere 2.5% for Re = 300 and 3.5% for Re = 600, which is significantly less than that produced by the other models.

Fig. 13
figure 13

Burgers’ problem: Extrapolative auto-regressive predictions using the proposed CNN model, for Re = 300 (a) and Re = 600 (b) and for time-steps = 180, 200 and 220; the training end time-step = 160

Fig. 14
figure 14

Burgers’ problem: \(L_2\) relative error of the auto-regressive predictions using the CNN model with increasing time for Re = 300 (a) and Re = 600 (b)

1D Stoker’s problem

Stoker’s solution describes the propagation and rarefaction waves resulting from a one-dimensional dam break over a wet, flat, frictionless bottom. Stoker’s problem is considered among the most challenging benchmark test cases due to its strong hyperbolic behavior and the discontinuity accompanying the propagation of the front wave resulting from the initial break. The dynamics are initiated by unequal water levels on the upstream and downstream sides of a dam located in the middle of the studied domain of 100 m.

The upstream water level is considered as an input random variable whose values are uniformly sampled within its plausible variability range \(h_{up} \, \in \, {U (8, 11)}\), whereas the downstream water depth is kept constant at a deterministic value \(h_{ds} \, = \, 1\,m\). The analytical solution for the water level is given as:

$$\begin{aligned} h(x, t) = {\left\{ \begin{array}{ll} h_{up} &{} \text {if }\, x \le x_A(t)\\ \frac{4}{9g}\left( \sqrt{gh_{up}}-\frac{x}{2t}\right) ^2 &{} \text {if }\, x_A(t) \le x \le x_B(t)\\ \frac{c_m^2}{g} &{} \text {if }\, x_B(t) \le x \le x_C(t)\\ h_{ds} &{} \text {if }\, x_C(t) \le x\\ \end{array}\right. } \end{aligned}$$
(17)
where x is the axial position, \(x_A(t) = x_0-t\sqrt{gh_{up}}\), \(x_B(t) = x_0 + t(\sqrt{gh_{up}}-3c_m)\) and \(x_C(t) = x_0+t\frac{2c_m^2(\sqrt{gh_{up}}-c_m)}{c_m^2-gh_{ds}}\), in which \(c_m = \sqrt{gh_m} \) and \(h_m\) is the intermediate water depth [28]. For each selected value in the generated sample set of the upstream water level, the analytical solution given above is evaluated over 1000 nodes (\(n_s = 1000\)) that cover the computational domain \(x \, \in \, [0, 100]\) m for all 450 time-steps (\(T = 450\)) of the temporal domain \(t \, \in \, [0, 3.6]\) s. Four hundred solution vectors, at random time steps, train the autoencoder network, and the remaining 50 vectors are used for validation. The forecasting models are trained on the first 250 compressed samples and validated using the subsequent 10 samples. During testing, the first \(n_t\) latent vectors are utilized to predict vectors at subsequent time steps via auto-regression (Table 6).

Table 6 Stoker’s problem: training, validation and testing dataset
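The piecewise solution of Eq. (17) can be evaluated with a short NumPy routine such as the sketch below; the dam position \(x_0 = 50\) m follows from the 100 m domain, the intermediate depth \(h_m\) is assumed to be precomputed (e.g., by solving the standard implicit dam-break relation), and the rarefaction branch is written exactly as in Eq. (17).

```python
import numpy as np

def stoker_depth(x, t, h_up, h_m, h_ds=1.0, x0=50.0, g=9.81):
    """Water depth of Eq. (17) at positions x and time t > 0.
    h_m (intermediate depth behind the front) is assumed to be known."""
    x = np.asarray(x, dtype=float)
    c_m = np.sqrt(g * h_m)
    x_A = x0 - t * np.sqrt(g * h_up)
    x_B = x0 + t * (np.sqrt(g * h_up) - 3.0 * c_m)
    x_C = x0 + t * 2.0 * c_m**2 * (np.sqrt(g * h_up) - c_m) / (c_m**2 - g * h_ds)

    h = np.empty_like(x)
    h[x <= x_A] = h_up                                           # undisturbed upstream level
    fan = (x > x_A) & (x <= x_B)                                 # rarefaction fan, as in Eq. (17)
    h[fan] = (4.0 / (9.0 * g)) * (np.sqrt(g * h_up) - x[fan] / (2.0 * t)) ** 2
    h[(x > x_B) & (x <= x_C)] = c_m**2 / g                       # constant state behind the bore
    h[x > x_C] = h_ds                                            # undisturbed downstream level
    return h
```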

Autoencoder for spatial compression

A methodology similar to the Burgers’ test case is adopted for the training of the autoencoder models, AE and CAE, and the forecasting models, LSTM, TCN, and CNN, for Stoker’s problem. The heat map plots obtained by stacking the predicted solution vectors along the x-axis (spatial nodes \(n_s\) along y, time-steps along x) are illustrated for both autoencoder models in Appendix Table 15.

Both AE and CAE effectively transform the solution vectors to a reduced latent space, since they produce fine reconstruction for vectors within as well as outside of the training domain of the autoencoder model. But again, only the CAE-CNN model learns the latent dynamics accurately enough to predict solution vectors outside the training domain (extrapolation).

The hyperparameters for the AE and CAE architectures are listed in Table 7.

Table 7 Stoker’s problem: hyperparameters for the AE and CAE networks

In the subsequent sections, the results of the forecasting models are produced using compression via the CAE, whose encoder has three layers with 8, 32 and 32 channels, respectively, and a latent dimension of 125. The kernel size of the 1D convolutions is 5, and each layer uses padding of size 2. All the hidden layers use the ReLU activation. A larger latent dimension is required for Stoker’s case because of the discontinuities and the larger domain size (\(n_s = 1000\)) compared to the Burgers’ case: as the compression ratio increases, the loss of information increases, leading to inaccurate forecasts.

LSTM

The LSTM model (“Long short-term memory (LSTM)” section) receives the input \(Z \, \in \, {\mathbb {R}}^{m \times n_t}\) and predicts \(z^{i+1}\, \in \, {\mathbb {R}}^{m} \). The LSTM model has an architecture similar to that of the Burgers’ case, with multiple LSTM layers whose hidden dimension equals the latent dimension of 125. The hyperparameters of the LSTM network for Stoker’s problem are summarized in Table 8.

Table 8 Stoker’s problem: hyperparameters for the LSTM network

The models are trained in batches of size 15, and the loss values for both training and validation converge in 2400 epochs. The model with the least validation loss has a lookback window of size 10 and 2 LSTM layers with hidden dimensions of 125 each. The extrapolation (Fig. 15) and error plots (Fig. 16) obtained from these models show that the LSTM model accurately estimates the solution vectors for time-steps within the training domain (\(i \le 250\)), but fails outside the training domain, as the relative error reaches 25% by the 150th time-step after the end of training.

Fig. 15
figure 15

Stoker’s problem: extrapolative auto-regressive predictions of the LSTM model for time-steps = 310, 360, and 410; the end of training time-step = 260

Fig. 16
figure 16

Stoker’s problem: \(L_2\) relative error of the auto-regressive predictions with increasing time for the LSTM model

TCN model

The TCN model also receives a sequence \(Z \, \in \, {\mathbb {R}}^{m \times n_t}\) and forecasts \(z^{i+1}\, \in \, {\mathbb {R}}^{m} \), with the dilated convolutions operating on the temporal axis and the latent dimension passed as channels. The model contains three temporal blocks (“Temporal Convolution Network (TCN)” section), with the same kernels and channels in each, and dilations that increase by a factor of two in subsequent blocks. The hyperparameters for the TCN network are listed in Table 9.

Table 9 Stoker’s problem: hyperparameters for the TCN network

The model training and validation losses reach convergence by 1000 epochs when training and validation are performed in batches of size 15. The best model (with the least validation loss) accommodates a sequence of vectors with a lookback window of size 20 and 125 latent modes. Each temporal block contains kernels of size 3 and has 200 channels. The extrapolation (Fig. 17) and error plots (Fig. 18) obtained from these models indicate that, similar to the LSTM, the TCN also predicts the solution vectors with acceptable accuracy for time steps within the training domain (\(i \le 250\)), but the solution stops propagating further in the extrapolative domain, and so the error increases to 23%.

Fig. 17
figure 17

Stoker’s problem: Extrapolative auto-regressive predictions using the TCN model for time-steps = 320, 370, and 420. The training end is at time-step = 270

Fig. 18
figure 18

Stoker’s problem: \(L_2\) relative error of the auto-regressive predictions with increasing time for the TCN model

CNN model

The proposed CNN model takes the transposed vector sequence \(Z^T \, \in \, {\mathbb {R}}^{n_t \times m}\) to produce \(z^{i+1}\, \in \, {\mathbb {R}}^{m} \), with 1D convolutions operating on the spatial axis (latent dimension) and the \(n_t\) latent vectors present on separate channels. The CNN model consists of three residual blocks (“A proposed Convolution Neural Network (CNN) for time forecasting” section), each with the same kernel size and number of channels. The hyperparameters for the CNN model are summarized in Table 10:

Table 10 Stoker’s problem: hyperparameters for the CNN model

The training and validation losses converge at around 1000 epochs when batches of size 16 are used. The model with the highest accuracy (least validation loss) has a lookback window of 20 steps, with 125 nodes in the latent vector at every step. Each of the three blocks possesses a kernel of size 3 and 100 channels in the 1D convolution layers. The extrapolation (Fig. 19) and error plots (Fig. 20) indicate that the CNN model is capable of modeling the latent dynamics, since accurate forecasts are produced for time steps beyond the training domain. The error values increase over time, reaching 5%, which is remarkably lower than that of the earlier models.

Fig. 19
figure 19

Stoker’s problem: extrapolative auto-regressive predictions using the CNN model for time-steps = 320, 370, and 420; the end of training is at time-step = 270

Fig. 20
figure 20

Stoker’s problem, \(L_2\) relative error of the auto-regressive predictions with increasing time for the CNN model

Application to a hypothetical dam-break in a river

To assess its effectiveness in conducting a forecasting analysis in a river with complex bathymetry, the most accurate forecasting model (i.e., the one with the least relative error on extrapolation), namely the CNN future-step predictor combined with the CAE autoencoder, is implemented in this third test case. The test focuses on a section of the Mille-Iles River in the province of Québec, Canada, which includes a dam depicted in Fig. 21. The Communauté métropolitaine de Montréal (CMM) has provided the data on bathymetry and roughness coefficients based on measurements and observations. The study area is discretized with an unstructured triangular mesh comprising 16,763 elements and 10,200 nodes, on which accurate solutions of the quantities of interest are obtained using an in-house multi-GPU finite volume solver specifically designed for the shallow water equations [35]. A detailed description of the physical domain of this test case is provided in [36].

Fig. 21
figure 21

Sketch of the reach of the Mille-Iles River with a close-up view of the studied zone. The cross-section line and the gauging points represent locations where results are represented as a function of the longitudinal direction and time, respectively

An imaginary breach scenario was initiated in a hypothetical dam, as indicated by the line in Fig. 21b, where both sides have unequal water levels. The downstream section of the dam is assumed to be dry, while the free surface of the upstream section is treated as a random input parameter. The values of this parameter are uniformly generated within a plausible range of variability, denoted as \(\eta _{up} \ \in \ {U (29, 32)}\) m. To create a snapshot matrix, the numerical solver is executed for each randomly selected value of the upstream free surface from the generated sample set. This process is carried out for 100 simulation time steps (\(T = 100\)) spanning the temporal domain (\(t \in [0, 50] s\)). For each combination of parameter and time, a high-fidelity solution is stored in a vector of dimension 10,200, representing the free surface values at each node in the computational domain.

The dataset for the 2D CAE-CNN architecture is generated by interpolating the unstructured data vectors onto a background structured mesh using SciPy’s griddata function. This interpolation is needed because the nodal data are irregularly spaced and do not form a regular grid suitable for a CNN. The method parameter of the griddata function specifies the interpolation method, which was set to ’linear’ in our case. The output of the griddata function is a 2D solution matrix of dimension \({n_s \times n_s}\) containing the values interpolated from the unstructured mesh onto the regular structured mesh.
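A minimal sketch of this interpolation step with scipy.interpolate.griddata is shown below; the grid bounds taken from the node coordinates and the NaN handling outside the convex hull are illustrative assumptions.

```python
import numpy as np
from scipy.interpolate import griddata

def to_structured(node_xy, node_values, n_grid=256):
    """Interpolate nodal free-surface values from the unstructured mesh onto a
    regular n_grid x n_grid background grid using linear interpolation.
    node_xy: (n_nodes, 2) coordinates; node_values: (n_nodes,) nodal solution."""
    xi = np.linspace(node_xy[:, 0].min(), node_xy[:, 0].max(), n_grid)
    yi = np.linspace(node_xy[:, 1].min(), node_xy[:, 1].max(), n_grid)
    XI, YI = np.meshgrid(xi, yi)
    grid = griddata(node_xy, node_values, (XI, YI), method="linear")
    return np.nan_to_num(grid)   # points outside the convex hull come back as NaN
```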

Thus, 100 solution matrices \([v^{1}, v^{2},..., v^{T}]\) of dimension \({n_s \times n_s} = 256 \times 256\) were generated via interpolation. These matrices are further compressed by the trained 2D CAE network to produce latent matrices \([z^{1}, z^{2},..., z^{T}]\) of dimension \(m \times m = 32 \times 32\). Eighty solution matrices, at random time steps, train the 2D CAE network, and the remaining 20 matrices are used for validation. The 2D CNN model is trained on the first 80 compressed samples and validated using the remaining 20. During testing, the first \(n_t\) latent matrices are utilized to predict the matrices at subsequent time steps via auto-regression (Table 11).

Table 11 2D River problem: training, validation and testing dataset

2D Convolutional autoencoder for spatial compression

A methodology similar to the previous test cases is adopted for training the 2D CAE model. The model also has a similar architecture, with the encoder consisting of three 2D convolution layers, each followed by activation and 2D max pooling layers. Another 2D convolution layer is used at the end to obtain a 2D latent matrix consisting of a single channel. The latent matrix obtained is not further compressed or flattened to a single dimension, and the model is also devoid of any dense layer. Therefore, during compression, the local spatial information is preserved, which facilitates the modeling of the latent dynamics by the CNN model, since it convolves on the spatial axes of the matrix and utilizes information from the neighboring nodes to predict the future time-step value at a specified node. The decoder mirrors the architecture of the encoder.
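A PyTorch sketch of such a 2D CAE is given below; the channel counts and the upsampling-based decoder are assumptions for illustration, chosen so that three pooling stages reduce a 256 x 256 field to a 32 x 32 single-channel latent matrix.

```python
import torch.nn as nn

class CAE2D(nn.Module):
    """Three Conv2d + ReLU + MaxPool2d stages bring a 256 x 256 field down to
    32 x 32; a final Conv2d reduces it to one channel. The decoder mirrors the
    encoder with upsampling. Channel counts are illustrative."""

    def __init__(self, channels=(8, 16, 32), kernel=3):
        super().__init__()
        pad = kernel // 2
        c1, c2, c3 = channels
        self.encoder = nn.Sequential(
            nn.Conv2d(1, c1, kernel, padding=pad), nn.ReLU(), nn.MaxPool2d(2),   # 256 -> 128
            nn.Conv2d(c1, c2, kernel, padding=pad), nn.ReLU(), nn.MaxPool2d(2),  # 128 -> 64
            nn.Conv2d(c2, c3, kernel, padding=pad), nn.ReLU(), nn.MaxPool2d(2),  # 64 -> 32
            nn.Conv2d(c3, 1, kernel, padding=pad),                               # single-channel 32 x 32 latent
        )
        self.decoder = nn.Sequential(
            nn.Conv2d(1, c3, kernel, padding=pad), nn.ReLU(), nn.Upsample(scale_factor=2),
            nn.Conv2d(c3, c2, kernel, padding=pad), nn.ReLU(), nn.Upsample(scale_factor=2),
            nn.Conv2d(c2, c1, kernel, padding=pad), nn.ReLU(), nn.Upsample(scale_factor=2),
            nn.Conv2d(c1, 1, kernel, padding=pad),
        )

    def forward(self, v):          # v: (batch, 1, 256, 256)
        z = self.encoder(v)        # z: (batch, 1, 32, 32)
        return self.decoder(z), z
```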

The 2D CAE effectively transforms the 2D solution matrix into a reduced 2D latent space of dimension 32 \( \times \) 32, since it produces fine reconstructions for solution matrices both within and outside of the training domain of the model. The hyperparameters for the employed 2D CAE architecture are listed in Table 12.

Table 12 2D River problem: hyperparameters for the 2D CAE network

2D CNN model

The 2D CNN model consists of three residual blocks (“A proposed Convolution Neural Network (CNN) for time forecasting” section), each having two 2D convolution layers with the same kernel size and number of channels. It takes the sequence of latent matrices \(Z \, \in \, {\mathbb {R}}^{n_t \times m \times m}\) to produce \(z^{i+1}\, \in \, {\mathbb {R}}^{m \times m} \), with 2D convolutions operating over the 2D spatial domain and the \(n_t\) latent matrices present on separate channels. The hyperparameters for the CNN model are summarized in Table 13:

Table 13 2D River problem: hyperparameters for the CNN model

The training and validation losses converge at around 2000 epochs when batches of size 16 are used. The model with the highest accuracy (least validation loss) has a lookback window of 30 steps, with the latent matrix having a dimension of 32 \( \times \) 32. Each of the three blocks possesses a kernel of size 3 and 200 channels in the 2D convolution layers. The interpolative predictions (Fig. 22) produced by the 2D CNN model are highly accurate and follow the ground truth as time progresses. The extrapolative predictions (Fig. 23) further demonstrate the capability of the proposed architecture in modeling the dynamics of the 2D dam-break scenario, since reasonable forecasts are produced for time steps beyond the training domain as well. The error values (Fig. 24) gradually increase over time (as expected), reaching 27% at the 100th time-step. These errors can be explained (at least partially) by the fact that when the water propagates over the dry land, no physical information, such as the evolution of the bathymetry, is provided to the model on the downstream side. Therefore, even if the CNN predictor has more potential than the well-known LSTM or TCN, the forecasting problem may be ill-posed, which manifests itself in the observed instabilities.

Fig. 22
figure 22

2D River problem: Interpolative auto-regressive predictions using the CNN model for the 20th, 30th, and 40th time-steps after the initial 30 time-steps used in the lookback window (i.e., time-steps = 50, 60, and 70, respectively); the end of training is at time-step = 80

Fig. 23
figure 23

2D River problem: Extrapolative auto-regressive predictions using the CNN model for time-steps = 81, 90, and 99; the end of training is at time-step = 80

Fig. 24
figure 24

2D River problem, \(L_2\) relative error of the auto-regressive predictions with increasing time for the CNN model

Conclusion

This study proposes a Convolutional Autoencoder (CAE) model for compression and a CNN future-step predictor for forecasting the solution vectors at the time steps following an input vector sequence. The approximation accuracy and time-extrapolation capabilities of the model are evaluated using three advection-dominated flow problems: the Burgers’ and Stoker’s problems, as well as a hypothetical dam-break problem in a real river with a complex 2D bathymetry. All the problems are characterized by sharp gradients. The models built especially for time-series forecasts, the LSTM and the TCN with propagation over time, produce acceptable results within the training domain, but the solution stops changing during extrapolation. However, when the dilated convolutions propagate along the spatial axis, the models produce good predictions for extrapolation as well.

The proposed CNN model for forecasting has an architecture (residual blocks with 1D convolutions propagating along space) similar to that of the TCN model but without causal padding or dilation. These have been eliminated, as the increasing receptive field has no significance during convolution in space, and causal padding degrades the results by causing the information extracted from neighboring cells to be shifted or discarded. The CNN model is capable of producing highly accurate predictions for both 1D test cases, with less than 5% relative \(L_2\) error in the extrapolation domain (\(\approx \) 60% of the training time domain). The results produced by the 2D CAE-CNN model for the dam-break problem are also acceptable, with a relative \(L_2\) error under 28% in the extrapolation domain (\(\approx \) 25% of the training time domain). However, the CNN models (both 1D and 2D) only produce accurate forecasts if they receive compressed latent vectors from the CAE models (1D and 2D, respectively), since these preserve the local spatial information better than the MLP encoder during compression. In addition, since the CNN model uses convolutions (like the TCN [34]), training and evaluation can be performed in parallel over long input sequences, in contrast to the LSTM. Thus, a fast, accurate, and robust framework is provided for order reduction. The model architecture is flexible and can be extended to three-dimensional spaces as well, by increasing the dimension of the convolutional filters. Future work will focus on improving the model by using physical mechanisms to inform the model during the forecasting phase and considering probabilistic settings to treat the uncertainties.

Availability of data and materials

All the source codes developed for this study are available at https://github.com/Yash-11/lstm_ROM.

References

  1. Rowley CW, Dawson ST. Model reduction for flow analysis and control. Annu Rev Fluid Mech. 2017;49(1):387–417. https://doi.org/10.1146/annurev-fluid-010816-060042.


  2. Taira K, Brunton SL, Dawson ST, Rowley CW, Colonius T, McKeon BJ, Schmidt OT, Gordeyev S, Theofilis V, Ukeiley LS, et al. Modal analysis of fluid flows: an overview. AIAA J. 2017;55(12):4013–41. https://doi.org/10.2514/1.j056060.


  3. Berkooz G, Holmes P, Lumley JL. The proper orthogonal decomposition in the analysis of turbulent flows. Annu Rev Fluid Mech. 1993;25(1):539–75. https://doi.org/10.1146/annurev.fl.25.010193.002543.


  4. Lozovskiy A, Farthing M, Kees C, Gildin E. Pod-based model reduction for stabilized finite element approximations of shallow water flows. J Comput Appl Math. 2016;302:50–70. https://doi.org/10.1016/j.cam.2016.01.029.


  5. Rezaian E, Biswas R, Duraisamy K. Non-intrusive parametric reduced order models for the prediction of internal and external flow fields over automobile geometries. Adv Aerospace Technol. 2021. https://doi.org/10.1115/imece2021-71728.


  6. Dutta S, Rivera-Casillas P, Cecil OM, Farthing MW. Pynirom-a suite of python modules for non-intrusive reduced order modeling of time-dependent problems. Softw Impacts. 2021;10: 100129. https://doi.org/10.1016/j.simpa.2021.100129.


  7. Alla A, Kutz JN. Nonlinear model order reduction via dynamic mode decomposition. SIAM J Sci Comput. 2017. https://doi.org/10.1137/16m1059308.


  8. Rafiq D, Bazaz MA. Nonlinear model order reduction via nonlinear moment matching with dynamic mode decomposition. Int J Non-Linear Mech. 2021;128: 103625. https://doi.org/10.1016/j.ijnonlinmec.2020.103625.


  9. Deshpande AS, Poggie J. Dynamic mode decomposition of a highly confined shock-wave/boundary-layer interaction. AIAA Paper 2021-1097, Session: CFD Methods IX. Published online: 4 Jan 2021. https://doi.org/10.2514/6.2021-1097.

  10. Xiao D, Fang F, Pain C, Navon I. A parameterized non-intrusive reduced order model and error analysis for general time-dependent nonlinear partial differential equations and its applications. Comput Methods Appl Mech Eng. 2017;317:868–89. https://doi.org/10.1016/j.cma.2016.12.033.

    Article  MathSciNet  MATH  Google Scholar 

  11. Dutta S, Farthing MW, Perracchione E, Savant G, Putti M. A greedy non-intrusive reduced order model for shallow water equations. J Comput Phys. 2021;439: 110378. https://doi.org/10.1016/j.jcp.2021.110378.

    Article  MathSciNet  MATH  Google Scholar 

  12. Ma Z, Yu J, Xiao R. Data-driven reduced order modeling for parametrized time-dependent flow problems. Phys Fluids. 2022;34(7): 075109. https://doi.org/10.1063/5.0098122.

    Article  Google Scholar 

  13. Xiao D. Error estimation of the parametric non-intrusive reduced order model using machine learning. Comput Methods Appl Mech Eng. 2019;355:513–34. https://doi.org/10.1016/j.cma.2019.06.018.

    Article  MathSciNet  MATH  Google Scholar 

  14. Karniadakis G, Kevrekidis Y, Lu L, Perdikaris P, Wang S, Yang L. Physics-informed machine learning. Nat Rev Phys. 2021;3:1–19. https://doi.org/10.1038/s42254-021-00314-5.

    Article  Google Scholar 

  15. Hesthaven J, Ubbiali S. Non-intrusive reduced order modeling of nonlinear problems using neural networks. J Comput Phys. 2018;363:55–78. https://doi.org/10.1016/j.jcp.2018.02.037.

    Article  MathSciNet  MATH  Google Scholar 

  16. Wan ZY, Vlachas P, Koumoutsakos P, Sapsis T. Data-assisted reduced-order modeling of extreme events in complex dynamical systems. PLoS ONE. 2018. https://doi.org/10.1371/journal.pone.0197704.

    Article  Google Scholar 

  17. Maulik R, Mohan A, Lusch B, Madireddy S, Balaprakash P, Livescu D. Time-series learning of latent-space dynamics for reduced-order model closure. Phys D Nonlinear Phenomena. 2020;405: 132368. https://doi.org/10.1016/j.physd.2020.132368.

    Article  MathSciNet  MATH  Google Scholar 

  18. Dutta S, Rivera-Casillas P, Styles B, Farthing MW. Reduced order modeling using advection-aware autoencoders. Math Comput Appl. 2022;27(3):34. https://doi.org/10.3390/mca27030034.

    Article  Google Scholar 

  19. Fatone F, Fresca S, Manzoni A. Long-time prediction of nonlinear parametrized dynamical systems by deep learning-based reduced order models; 2022. https://doi.org/10.48550/ARXIV.2201.10215.

  20. Chen R, Rubanova Y, Bettencourt J, Duvenau D. Neural ordinary differential equations, In Proceedings of the 32nd International Conference on Neural Information Processing Systems (NIPS’18), Montréal, QC, Canada; 2018. p. 6572–83.

  21. Dutta S, Rivera-Casillas P, Farthing M. Neural ordinary differential equations for data-driven reduced order modeling of environmental hydrodynamics; 2021.

  22. Wu P, Sun J, Chang X, Zhang W, Arcucci R, Guo Y, Pain CC. Data-driven reduced order model with temporal convolutional neural network. Comput Methods Appl Mech Eng. 2020;360: 112766. https://doi.org/10.1016/j.cma.2019.112766.

    Article  MathSciNet  MATH  Google Scholar 

  23. Xu J, Duraisamy K. Multi-level convolutional autoencoder networks for parametric prediction of spatio-temporal dynamics. Comput Methods Appl Mech Eng. 2020;372: 113379. https://doi.org/10.1016/j.cma.2020.113379.

    Article  MathSciNet  MATH  Google Scholar 

  24. Liu C, Fu R, Xiao D, Stefanescu R, Sharma P, Zhu C, Sun S, Wang C. Enkf data-driven reduced order assimilation system. Eng Anal Bound Elem. 2022;139:46–55. https://doi.org/10.1016/j.enganabound.2022.02.016.

    Article  MathSciNet  Google Scholar 

  25. Maulik R, Lusch B, Balaprakash P. Reduced-order modeling of advection-dominated systems with recurrent neural networks and convolutional autoencoders. Phys Fluids. 2021;33(3): 037106. https://doi.org/10.1063/5.0039986.

    Article  Google Scholar 

  26. Cheng F, Xu H, Feng X. Model order reduction method based on (r)pod-anns for parameterized time-dependent partial differential equations. Comput Fluids. 2022;241: 105481. https://doi.org/10.1016/j.compfluid.2022.105481.

    Article  MathSciNet  MATH  Google Scholar 

  27. Heaney CE, Wolffs Z, Tómasson JA, Kahouadji L, Salinas P, Nicolle A, Navon IM, Matar OK, Srinil N, Pain CC, et al. An ai-based non-intrusive reduced-order model for extended domains applied to multiphase flow in pipes. Phys Fluids. 2022;34(5): 055111. https://doi.org/10.1063/5.0088070.

    Article  Google Scholar 

  28. Abdedou A, Soulaïmani A. Reduced-order modeling for stochastic large-scale and time-dependent problems using deep spatial and temporal convolutional autoencoders 2022. https://doi.org/10.48550/ARXIV.2208.03190.

  29. Jacquier P, Abdedou A, Delmas V, Soulaïmani A. Non-intrusive reduced-order modeling using uncertainty-aware deep neural networks and proper orthogonal decomposition: Application to flood modeling. J Comput Phys. 2021;424: 109854. https://doi.org/10.1016/j.jcp.2020.109854.

    Article  MathSciNet  MATH  Google Scholar 

  30. Geneva N, Zabaras N. Modeling the dynamics of PDE systems with physics-constrained deep auto-regressive networks. J Comput Phys. 2020;403: 109056. https://doi.org/10.1016/j.jcp.2019.109056.

    Article  MathSciNet  MATH  Google Scholar 

  31. Goodfellow I, Bengio Y, Courville A. Deep learning. MIT Press; 2016, http://www.deeplearningbook.org.

  32. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9:1735–80. https://doi.org/10.1162/neco.1997.9.8.1735.

    Article  Google Scholar 

  33. Theodoridis S. Machine learning a Bayesian and optimization perspective. Academic Press: Elsevier; 2020.

    Google Scholar 

  34. Bai S, Kolter JZ, Koltun V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling; 2018. arXiv:1803.01271.

  35. Abdedou A, Soulaïmani A. A non-intrusive reduced-order modeling for uncertainty propagation of time-dependent problems using a b-splines bézier elements-based method and proper orthogonal decomposition: Application to dam-break flows. Comput Math Appl. 2021;102:187–205. https://doi.org/10.1016/j.camwa.2021.10.006.

    Article  MathSciNet  MATH  Google Scholar 

  36. Delmas V, Soulaïmani A. Multi-gpu implementation of a time-explicit finite volume solver using cuda and a cuda-aware version of openmpi with application to shallow water flows. Comput Phys Commun. 2022;271: 108190. https://doi.org/10.1016/j.cpc.2021.108190.

    Article  MathSciNet  MATH  Google Scholar 


Acknowledgements

This research was supported by the Natural Sciences and Engineering Research Council of Canada and MITACS; the financial support is gratefully acknowledged. The computing resources were provided by the Digital Research Alliance of Canada, whose support is highly appreciated.

Funding

This work is supported by the Natural Sciences and Engineering Research Council of Canada (Grant number RGPIN/2693-2021).

Author information

Contributions

PB and YK: Methodology, Software, Validation, Writing an original draft. AS: Methodology, review, and editing, Project administration, Funding acquisition, Validation, Supervision. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Azzeddine Soulaïmani.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Heat map plots for Burgers’ and Stoker’s problems

See Tables 14, 15.

Table 14 Burgers’ problem: auto-regressive forecasts for Re = 300 (top) and Re = 600 (bottom) from the forecasting models when latent vectors are obtained by AE and CAE
Table 15 Stoker’s problem: auto-regressive forecasts from the forecasting models when latent vectors are obtained by AE and CAE

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article


Cite this article

Bhatt, P., Kumar, Y. & Soulaïmani, A. Deep convolutional architectures for extrapolative forecasts in time-dependent flow problems. Adv. Model. and Simul. in Eng. Sci. 10, 17 (2023). https://doi.org/10.1186/s40323-023-00254-y


Keywords