On the use of neural networks to evaluate performances of shell models for composites

This paper presents a novel methodology to assess the accuracy of shell finite elements via neural networks. The proposed framework exploits the synergies among three well-established methods, namely, the Carrera Unified Formulation (CUF), the Finite Element Method (FE), and neural networks (NN). CUF generates the governing equations for any-order shell theories based on polynomial expansions over the thickness. FE provides the numerical results used to train the NN. Multilayer NN take the generalized displacement variables and the thickness ratio as inputs, and the target is the maximum transverse displacement. This work investigates the minimum requirements for the NN concerning the number of neurons and hidden layers, and the size of the training set. The results look promising as the NN requires only a fraction of the FE analyses for training, can evaluate the accuracy of any-order model, and can incorporate physical features, e.g., the thickness ratio, that drive the complexity of the mathematical model. In other words, NN can trigger fast, informed decision-making on the structural model to use and the influence of design parameters without the need to modify, rebuild, or rerun an FE model.


Introduction
Shell finite elements (FE) are standard options to model two-dimensional (2D) curved structures. In commercial codes, shell FE have the assumptions of the classical theories [1][2][3] leading to up to six degrees of freedom (DOF) per node. Such assumptions may be too restrictive in the case of composite structures in which the high transverse deformability and the transverse anisotropy require the proper modeling of shear and normal transverse stresses, and variations of the displacement field at the interface between two layers with different mechanical properties, i.e., the Zig-Zag effect [4]. 3D FE can incorporate such effects but can lead to prohibitive computational costs due to severe aspect ratio constraints. 2D FE remain computationally more efficient and attractive and, over the years, many strategies emerged to extend their capabilities via, for instance, the use of higher-order polynomial thickness expansions leading to increasing DOF per node [5]. This paper presents a new methodology to assess shell FE for linear static analyses of composites, and the following literature survey focuses on this specific area. More comprehensive reviews are in [6][7][8][9].
Another powerful approach is the Proper Generalized Decomposition (PGD) method [69,70] in which the construction of the refined model and the solution of the problem take place simultaneously.
From the structural standpoint, the methodology in this paper adopts the Carrera Unified Formulation (CUF), which provides any-order shell theories without formal changes in the problem matrices [4,71,72]. One of the capabilities of CUF is the axiomatic/asymptotic method (AAM) [73,74] to analyze the relevance of any generalized displacement variable. The systematic use of AAM leads to the definition of the Best Theory Diagram (BTD), i.e., a 2D plot to localize shell models with minimum DOF and maximum accuracy [75,76]. One of the aims of this paper is to reduce the computational cost of obtaining the BTD via neural networks (NN). Such networks are mathematical models inspired by biological nervous systems and composed of simple computational units interlinked by a system of connections [77] that learn through training via samples. In this paper, CUF FE provides the samples for the supervised learning of multilayer perceptrons that evaluate the accuracy of refined shell models without assembling FE matrices or running FE analyses. The use of NN in structural and material simulation is increasing due to their superior computational efficiency [78][79][80]. Recent applications for composites concern the prediction of the elastic properties [81], buckling load [82,83], failure strength [84,85], natural frequencies [86][87][88], and geometry optimization [89].
In this paper, "Finite element formulation" section provides a brief theoretical description of CUF and its FE formulation. "Best Theory Diagram" section introduces the concept of BTD. "Neural networks and coding" section describes the use of NN to evaluate the accuracy of a shell model. Results and conclusions are in "Results" and "Conclusions" sections, respectively.

Finite element formulation
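The CUF displacement field for a 2D model is

$$\mathbf{u}(\alpha, \beta, z) = F_\tau(z)\, \mathbf{u}_\tau(\alpha, \beta), \qquad \tau = 1, \dots, M$$

The Einstein notation acts on $\tau$. $\mathbf{u}$ is the displacement vector, $(u_x \; u_y \; u_z)^T$. $F_\tau$ are the thickness expansion functions. $\mathbf{u}_\tau$ is the vector of the generalized unknown displacements. $M$ is the number of expansion terms. A fourth-order model, referred to as N = 4, is

$$\begin{aligned}
u_x &= u_{x1} + z\, u_{x2} + z^2 u_{x3} + z^3 u_{x4} + z^4 u_{x5} \\
u_y &= u_{y1} + z\, u_{y2} + z^2 u_{y3} + z^3 u_{y4} + z^4 u_{y5} \\
u_z &= u_{z1} + z\, u_{z2} + z^2 u_{z3} + z^3 u_{z4} + z^4 u_{z5}
\end{aligned}$$

and has 15 nodal DOF. The order and type of expansion is a free parameter; thus, the theory of structure is an input of the analysis. The metric coefficients $H_\alpha^k$, $H_\beta^k$ and $H_z^k$ of the kth layer are

$$H_\alpha^k = A^k \left(1 + \frac{z_k}{R_\alpha^k}\right), \qquad H_\beta^k = B^k \left(1 + \frac{z_k}{R_\beta^k}\right), \qquad H_z^k = 1$$

$R_\alpha^k$ and $R_\beta^k$ are the principal radii of the middle surface of the kth layer, $A^k$ and $B^k$ the coefficients of the first fundamental form of $\Omega_k$, see Fig. 1. This paper focused only on shells with constant radii of curvature with $A^k = B^k = 1$. The geometrical relations are

$$\begin{aligned}
\boldsymbol{\varepsilon}_p^k &= \{\varepsilon_{\alpha\alpha}^k, \varepsilon_{\beta\beta}^k, \varepsilon_{\alpha\beta}^k\}^T = (\mathbf{D}_p^k + \mathbf{A}_p^k)\, \mathbf{u}^k \\
\boldsymbol{\varepsilon}_n^k &= \{\varepsilon_{\alpha z}^k, \varepsilon_{\beta z}^k, \varepsilon_{zz}^k\}^T = (\mathbf{D}_{n\Omega}^k + \mathbf{D}_{nz}^k - \mathbf{A}_n^k)\, \mathbf{u}^k
\end{aligned}$$

where

$$\mathbf{D}_p^k = \begin{bmatrix} \frac{\partial_\alpha}{H_\alpha^k} & 0 & 0 \\ 0 & \frac{\partial_\beta}{H_\beta^k} & 0 \\ \frac{\partial_\beta}{H_\beta^k} & \frac{\partial_\alpha}{H_\alpha^k} & 0 \end{bmatrix}, \quad
\mathbf{D}_{n\Omega}^k = \begin{bmatrix} 0 & 0 & \frac{\partial_\alpha}{H_\alpha^k} \\ 0 & 0 & \frac{\partial_\beta}{H_\beta^k} \\ 0 & 0 & 0 \end{bmatrix}, \quad
\mathbf{D}_{nz}^k = \begin{bmatrix} \partial_z & 0 & 0 \\ 0 & \partial_z & 0 \\ 0 & 0 & \partial_z \end{bmatrix}$$

$$\mathbf{A}_p^k = \begin{bmatrix} 0 & 0 & \frac{1}{H_\alpha^k R_\alpha^k} \\ 0 & 0 & \frac{1}{H_\beta^k R_\beta^k} \\ 0 & 0 & 0 \end{bmatrix}, \quad
\mathbf{A}_n^k = \begin{bmatrix} \frac{1}{H_\alpha^k R_\alpha^k} & 0 & 0 \\ 0 & \frac{1}{H_\beta^k R_\beta^k} & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

The stress-strain relations are

$$\begin{aligned}
\boldsymbol{\sigma}_p^k &= \mathbf{C}_{pp}^k \boldsymbol{\varepsilon}_p^k + \mathbf{C}_{pn}^k \boldsymbol{\varepsilon}_n^k \\
\boldsymbol{\sigma}_n^k &= \mathbf{C}_{np}^k \boldsymbol{\varepsilon}_p^k + \mathbf{C}_{nn}^k \boldsymbol{\varepsilon}_n^k
\end{aligned}$$

where

$$\mathbf{C}_{pp}^k = \begin{bmatrix} C_{11}^k & C_{12}^k & C_{16}^k \\ C_{12}^k & C_{22}^k & C_{26}^k \\ C_{16}^k & C_{26}^k & C_{66}^k \end{bmatrix}, \quad
\mathbf{C}_{pn}^k = (\mathbf{C}_{np}^k)^T = \begin{bmatrix} 0 & 0 & C_{13}^k \\ 0 & 0 & C_{23}^k \\ 0 & 0 & C_{36}^k \end{bmatrix}, \quad
\mathbf{C}_{nn}^k = \begin{bmatrix} C_{55}^k & C_{45}^k & 0 \\ C_{45}^k & C_{44}^k & 0 \\ 0 & 0 & C_{33}^k \end{bmatrix}$$

The FE formulation uses a nine-node shell element based on the Mixed Interpolation of Tensorial Components (MITC) method [90]. The displacement vector becomes

$$\mathbf{u}_\tau = N_i\, \mathbf{u}_{\tau i}, \qquad \delta\mathbf{u}_s = N_j\, \delta\mathbf{u}_{sj}, \qquad i, j = 1, \dots, 9$$

$\mathbf{u}_{\tau i}$ and $\delta\mathbf{u}_{sj}$ are the nodal displacement vector and its virtual variation, respectively. The strain expression becomes

$$\begin{aligned}
\boldsymbol{\varepsilon}_p &= F_\tau (\mathbf{D}_p + \mathbf{A}_p)(N_i \mathbf{I})\, \mathbf{u}_{\tau i} \\
\boldsymbol{\varepsilon}_n &= F_\tau (\mathbf{D}_{n\Omega} - \mathbf{A}_n)(N_i \mathbf{I})\, \mathbf{u}_{\tau i} + F_{\tau,z} (N_i \mathbf{I})\, \mathbf{u}_{\tau i}
\end{aligned} \tag{10}$$

MITC counteracts membrane and shear locking via a specific interpolation strategy for the strain components on the nine-node shell element, as follows:

$$\boldsymbol{\varepsilon}_p = \begin{bmatrix} \varepsilon_{\alpha\alpha} \\ \varepsilon_{\beta\beta} \\ \varepsilon_{\alpha\beta} \end{bmatrix} = \begin{bmatrix} \mathbf{N}_{m1} & 0 & 0 \\ 0 & \mathbf{N}_{m2} & 0 \\ 0 & 0 & \mathbf{N}_{m3} \end{bmatrix} \begin{bmatrix} \boldsymbol{\varepsilon}_{\alpha\alpha_{m1}} \\ \boldsymbol{\varepsilon}_{\beta\beta_{m2}} \\ \boldsymbol{\varepsilon}_{\alpha\beta_{m3}} \end{bmatrix}, \qquad
\boldsymbol{\varepsilon}_n = \begin{bmatrix} \varepsilon_{\alpha z} \\ \varepsilon_{\beta z} \\ \varepsilon_{zz} \end{bmatrix} = \begin{bmatrix} \mathbf{N}_{m1} & 0 & 0 \\ 0 & \mathbf{N}_{m2} & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \boldsymbol{\varepsilon}_{\alpha z_{m1}} \\ \boldsymbol{\varepsilon}_{\beta z_{m2}} \\ \varepsilon_{zz} \end{bmatrix}$$

Strains $\boldsymbol{\varepsilon}_{\alpha\alpha_{m1}}$, $\boldsymbol{\varepsilon}_{\beta\beta_{m2}}$, $\boldsymbol{\varepsilon}_{\alpha\beta_{m3}}$, $\boldsymbol{\varepsilon}_{\alpha z_{m1}}$, and $\boldsymbol{\varepsilon}_{\beta z_{m2}}$ stem from Eq. 10 and

$$\mathbf{N}_{m1} = [N_{A1}, N_{B1}, N_{C1}, N_{D1}, N_{E1}, N_{F1}], \quad \mathbf{N}_{m2} = [N_{A2}, N_{B2}, N_{C2}, N_{D2}, N_{E2}, N_{F2}], \quad \mathbf{N}_{m3} = [N_P, N_Q, N_R, N_S]$$

The governing equations are

$$\mathbf{k}^k_{\tau s i j}\, \mathbf{u}^k_{\tau i} = \mathbf{p}^k_{sj}$$

The 3 × 3 matrix $\mathbf{k}^k_{\tau s i j}$ is the fundamental mechanical nucleus whose expression is independent of the order of the expansion. $\mathbf{p}^k_{sj}$ is the load vector. More details regarding the finite element formulation are in [72].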
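The thickness expansion described above can be illustrated with a minimal Python sketch. It assumes Taylor-like expansion functions, F_τ(z) = z^(τ−1), which is what the N = 4 model with 15 nodal DOF implies; the function names and the numerical values of the generalized unknowns are hypothetical and serve only as an illustration.

```python
import numpy as np

def thickness_functions(z, order):
    """Return F_tau(z) = z**(tau - 1) for tau = 1, ..., order + 1."""
    return np.array([z ** p for p in range(order + 1)])

def displacement(z, u_tau):
    """Evaluate u(z) = F_tau(z) u_tau at one in-plane point.

    u_tau has shape (M, 3): row tau holds the generalized unknowns
    (u_x, u_y, u_z) associated with the tau-th expansion term.
    """
    u_tau = np.asarray(u_tau, dtype=float)
    F = thickness_functions(z, u_tau.shape[0] - 1)
    return F @ u_tau  # summation over the repeated index tau

def nodal_dof(order):
    """Three displacement components per expansion term."""
    return 3 * (order + 1)

# Hypothetical generalized unknowns for an N = 4 model (arbitrary values).
u_tau_example = np.arange(15, dtype=float).reshape(5, 3)
u_mid = displacement(0.0, u_tau_example)  # only zeroth-order terms contribute at z = 0
u_top = displacement(0.5, u_tau_example)
```

Changing `order` changes the number of unknowns but not the structure of the evaluation, which mirrors the idea that the theory of structure is an input of the analysis.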

Best Theory Diagram
One of the CUF capabilities is the axiomatic/asymptotic method (AAM) to evaluate the relevance of generalized variables and the accuracy of structural theories [73,74]. The fourth-order, equivalent single layer shell model is the reference model of this paper, and all the theories evaluated stem from the combinations of the terms of the full fourth-order expansion, i.e., $2^{15}$ models. The CUF generates the governing equations for the theories considered. In particular, the CUF generates reduced models having combinations of the starting terms as generalized unknowns. Two parameters can identify a theory, namely, the number of active terms and the error or accuracy provided. The Best Theory Diagram (BTD) is the curve composed of all models providing the minimum error with the least number of variables, see Fig. 3. Given the accuracy, models with fewer variables than those on the BTD do not exist. Given the number of variables, models with better accuracy than those on the BTD do not exist. In this paper, the error refers to the maximum transverse displacement, evaluated against the N = 4 reference solution,

$$\%\,\mathrm{Error} = 100 \times \frac{\left| u_z - u_z^{N=4} \right|}{\left| u_z^{N=4} \right|} \tag{14}$$

The combined use of CUF and AAM allows the evaluation of the accuracy of any finite element, as shown in Table 1. Black and white triangles indicate active and inactive generalized displacement variables, respectively, and DOF the nodal degrees of freedom of the element. N = 4 is the full expansion of fourth-order. Three other models, well-known from the literature, have incomplete expansions, namely,
• The First-Order Shear Deformation Theory (FSDT) with five DOF,
• A seven DOF model with parabolic transverse displacement, referred to as PTD,
• A nine DOF model with third-order in-plane displacements, referred to as TSDT.

Table 1 Examples of shell models assessed (columns: DOF, $u_{x1}$, $u_{y1}$, $u_{z1}$, $u_{x2}$, $u_{y2}$, $u_{z2}$, $u_{x3}$, $u_{y3}$, $u_{z3}$, $u_{x4}$, $u_{y4}$, $u_{z4}$, $u_{x5}$, $u_{y5}$, $u_{z5}$)

Neural networks and coding
CUF FE analyses generate inputs to train NN. In this paper, the inputs are the structural theories and the thickness ratio, and outputs are the maximum transverse displacements. Figure 4 shows the two ways adopted in this paper to build the BTD, i.e.,
• CUF generates the governing FE equations for all the shell theories stemming from subsets of the fourth-order expansion. Given that the expansion has 15 terms, overall, $2^{15}$ FE shell models are available. For instance, FSDT is one of these models, in which five terms are active ($u_{x1}$, $u_{y1}$, $u_{z1}$, $u_{x2}$, and $u_{y2}$) and ten are inactive.
• The FE way runs $2^{15}$ static FE analyses and reports the error and number of active terms of each case in a 2D plot.
• The NN way runs one-tenth of the FE analyses and uses them for training. Then, the 2D plot stems from querying the trained NN with all $2^{15}$ shell models.

• If a/h is a training variable, and, e.g., three a/h values are available, the overall number of analyses is $3 \times 2^{15}$, and the query of the NN includes the shell model and the thickness ratio.
The aim is to build the BTD with fewer than $2^{15}$ analyses and avoid new FE analyses as the thickness ratio changes. In Fig. 4, the NN training set has 10% of all analyses as this is a typical value used in this paper. Also, the figure shows only one hidden layer, although more layers could be useful.

Fig. 4 CUF and NN framework
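The workflow above (enumerate all theories, pick a subset for training, and extract the BTD from error data) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and `toy_error` is a synthetic stand-in for the FE or NN error, not data from this paper.

```python
import itertools
import random

N_TERMS = 15  # u_x1 ... u_z5 of the fourth-order expansion

def all_models(n_terms=N_TERMS):
    """Every active/inactive combination as a tuple of 0/1 flags."""
    return list(itertools.product((0, 1), repeat=n_terms))

def training_subset(models, fraction=0.10, seed=0):
    """Randomly pick a fraction of the theories to analyze via FE for training."""
    rng = random.Random(seed)
    k = max(1, round(fraction * len(models)))
    return rng.sample(models, k)

def best_theory_diagram(models, errors):
    """For each number of active terms, keep the model with the smallest error."""
    best = {}
    for model, err in zip(models, errors):
        dof = sum(model)
        if dof not in best or err < best[dof][1]:
            best[dof] = (model, err)
    return dict(sorted(best.items()))

# SYNTHETIC error, for illustration only: each inactive term adds a fixed
# penalty, with lower-order terms penalized more (hypothetical weights).
TOY_WEIGHTS = list(range(N_TERMS, 0, -1))  # 15, 14, ..., 1

def toy_error(model):
    return sum(w for w, flag in zip(TOY_WEIGHTS, model) if flag == 0)

models = all_models()
train = training_subset(models)            # theories that would be run via FE
errors = [toy_error(m) for m in models]    # in practice: FE results or NN queries
btd = best_theory_diagram(models, errors)
```

In the NN way, `errors` would come from querying the trained network for every entry of `models`, while only the theories in `train` require an FE analysis.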
The NN configuration is a multilayer feed-forward network with early stopping and mean squared error as the objective function. Each layer has ten neurons. This paper adopts Levenberg-Marquardt training functions [91]. The input coding is a vector with 16 elements, that is, all the fourth-order expansion generalized displacement variables and the thickness ratio. Each generalized variable is either '1' or '0' to indicate its active or inactive status. Each input has an associated output composed of a vector containing the error, Eq. 14. As an example, the coded input of a generic shell model with h/a = 0.1 is a vector of fifteen '1' or '0' entries followed by 0.1.

Table 2 presents the computational costs of the various processes involved in this paper. The cost normalization used the most expensive process as the reference. The number of layers was chosen via a convergence analysis as the adoption of more than one layer led to negligible increments of the computational cost. For the type of NN adopted here, the use of 1-3 layers is a standard choice [91]. The training data generation used a random selection of the structural theories, and no significant variations in the results were observed among different randomly chosen training sets.

Table 3 shows an overview of the analyses employed to obtain the BTD. In all cases, the input is the structural theory. The capability of setting the theory as an input is a feature provided by CUF. As mentioned in previous sections, CUF allows one to handle the kinematics with no restrictions concerning the order and type of expansions adopted. Such a capability is decisive to obtain the BTD as a tool to verify the accuracy of any structural theory. In other words, via the BTD, the effect of the addition of a new generalized variable can be estimated. Once the structural theory to be verified is set, the FE option requires the computation of the stiffness matrix and the solution of the linear static analysis. On the other hand, the trained NN can provide the output by simply encoding the structural model.
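The input coding and the training step described above can be sketched as follows. This is a minimal illustration and not the implementation used in this paper: it assumes SciPy's `least_squares` with `method="lm"` as a stand-in for Levenberg-Marquardt training, a single hidden layer of ten `tanh` neurons, no early stopping, and a SYNTHETIC target in place of FE errors. All names (`encode`, `predict`, `residuals`) are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares

# Term ordering follows the table headers: u_x1, u_y1, u_z1, ..., u_x5, u_y5, u_z5.
TERMS = [f"u_{c}{k}" for k in range(1, 6) for c in "xyz"]
N_IN, N_HID = 16, 10  # 15 flags + thickness ratio; ten hidden neurons

def encode(active_terms, thickness_ratio):
    """16-element input: 15 active/inactive flags followed by the thickness ratio."""
    flags = [1.0 if t in active_terms else 0.0 for t in TERMS]
    return np.array(flags + [thickness_ratio])

def predict(p, X):
    """One-hidden-layer perceptron; p stores all weights and biases in a flat vector."""
    W1 = p[:N_IN * N_HID].reshape(N_HID, N_IN)
    b1 = p[N_IN * N_HID:N_IN * N_HID + N_HID]
    W2, b2 = p[N_IN * N_HID + N_HID:-1], p[-1]
    return np.tanh(X @ W1.T + b1) @ W2 + b2

def residuals(p, X, y):
    """Residual vector; its sum of squares is proportional to the mean squared error."""
    return predict(p, X) - y

rng = np.random.default_rng(0)
n_samples = 400  # roughly the training-set size quoted for 10% of 2^12 theories
flags = rng.integers(0, 2, size=(n_samples, 15)).astype(float)
ratio = rng.choice([0.01, 0.02, 0.1], size=(n_samples, 1))  # h/a values (assumed)
X = np.hstack([flags, ratio])

# SYNTHETIC target standing in for the FE error (illustration only).
w = rng.uniform(0.0, 1.0, size=15)
y = (1.0 + 5.0 * ratio[:, 0]) * ((1.0 - flags) @ w)

p0 = 0.1 * rng.standard_normal(N_IN * N_HID + 2 * N_HID + 1)
fit = least_squares(residuals, p0, args=(X, y), method="lm", max_nfev=20000)

# Query the trained network with the FSDT coding at h/a = 0.1.
fsdt = encode({"u_x1", "u_y1", "u_z1", "u_x2", "u_y2"}, 0.1)
fsdt_error = predict(fit.x, fsdt[None, :])[0]
```

Once trained, evaluating a new theory reduces to one call to `predict` on its coded input, with no stiffness matrix assembly or linear solve.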
The use of NN aims to overcome two current limitations of the BTD. First, the computational cost can be very high as thousands of analyses are needed, and this cost grows with the complexity of the problem. Second, the evaluation of the BTD becomes even more challenging as various problem characteristics vary, e.g., boundary conditions or material properties, and multiple outputs are considered, e.g., displacements and stresses. The use of NN may be a solution to both issues. This paper aims to address the first limitation and to partially handle the second one. To address the second issue comprehensively, other NN architectures are needed, e.g., convolutional NN as they can manage high-dimensional input features with high efficiency [92].

Results
The numerical results focus on cases from [93]. The shell has a = b, $R_\alpha = R_\beta = R$, and R/a = 5. The load is bi-sinusoidal and applied on the top surface, $p_z = \bar{p}_z \sin(\pi\alpha/a) \sin(\pi\beta/b)$. The material properties are $E_1/E_2 = 25$, $G_{12}/E_2 = G_{13}/E_2 = 0.5$, $G_{23}/E_2 = 0.2$, $\nu = 0.25$. The finite element model of a quarter of the shell has a 4 × 4 mesh as this discretization provides sufficiently accurate results [93]. In all cases, the BTD vertical axis ranges from five to fifteen since models with four or fewer DOF provide very high errors and are not of practical interest. The numerical results stemmed from two methodologies as follows:
• The finite element method, FE, required $2^{12}$ static analyses to build the BTD, i.e., one static analysis for each shell theory having a combination of 12 generalized variables. To lessen the computational cost, the three zeroth-order terms of the expansion are always active as, usually, their influence is very high.
• NN required 10% of $2^{12}$ to train, i.e., some 400 static analyses. Depending on the case, the architecture of the network had one or three layers and ten neurons per layer.

The first numerical case refers to a simply-supported shell with symmetric lamination, 0/90/0. Table 4 shows the reference values of transverse displacements adopted to build the BTD. The current N = 4 model provides good accuracy, although, for thicker shells, the match with 3D solutions is not perfect. However, for the scope of the paper, its accuracy is sufficient. First, the analysis concerned the choice of the network parameters. Figures 5 and 6 show the BTD from NN via 5 and 10 neurons and using 5% and 10% of the $2^{12}$ cases for training. Table 5 presents some particular cases focused on structural theories from the literature. The FE BTD serves as a benchmark. The results show that the use of 10 neurons and 10% of cases provides very good matches. The remaining analyses made use of such a network architecture and focused on the effect of the thickness ratio on the BTD, given that, as seen in previous papers like [76], this is the most relevant parameter to determine the sets of most important generalized variables. Figure 7 shows the results for a/h = 100, in which (a) reports the accuracy of all $2^{12}$ models. Figures 8 and 9 report the results of a/h = 50 and 10, respectively, and Table 6 presents the numerical values related to the models from literature. The BTD models from NN are in Tables 7, 8 and 9. For instance, the active terms listed for a/h = 10 define the displacement field of the six DOF best model. The last row of each table reports the relevance factor of the expansion orders (RF). The RF is the ratio between the number of active instances and the total number of cases. For instance, $RF_0 = 1$ indicates that the zeroth-order terms are always present in the BTD. The combined information stemming from the previous figures and tables is in Figure 10 for a/h = 10 with the explicit indication of the seven, six, and five DOF best displacement fields.

Table 7 BTD models, 0/90/0, a/h = 100 (columns: DOF, $u_{x1}$, $u_{y1}$, $u_{z1}$, $u_{x2}$, $u_{y2}$, $u_{z2}$, $u_{x3}$, $u_{y3}$, $u_{z3}$, $u_{x4}$, $u_{y4}$, $u_{z4}$, $u_{x5}$, $u_{y5}$, $u_{z5}$)

The results suggest that
• The proposed NN framework can reproduce the FE results with satisfactory accuracy. Two capabilities are relevant, namely, the possibility of using the NN to evaluate theories from the literature and the ability to cover the discontinuous error range entirely.
• The discontinuity in the error range, i.e., the presence of accuracy bands, indicates that there may not exist structural theories satisfying a given error requirement. As shown in [76], such gaps widen as the thickness ratio increases. For thin shells, the lower-order terms, i.e., the FSDT variables, play a decisive role, and their absence causes high errors. As the shell becomes thicker, higher-order terms gain relevance, leading to more homogeneous error distributions.
• There are no relevant differences in the BTD for a/h = 100 and 50 except that the latter has a broader error range as the five DOF model, coinciding with the FSDT, yields a 2% error. The models from the literature, although not always on the BTD curve, provide satisfactory accuracies.
• For a/h = 10, at least six DOF are necessary to have errors smaller than 1%, and the variables required to meet such a requirement are the cubic in-plane ones.
• The analysis of the RF shows that, as is well known, for thin shells, zeroth- and first-order variables are the most relevant. As the thickness increases, the third-order terms gain importance with smaller relevance for first-order ones. The NN detected very similar RF as compared to FE from [76], meaning that the proposed framework can detect the accuracy of a given structural model and determine the models on the BTD curve reliably.
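The relevance factor lends itself to a short worked example. The sketch below assumes one reading of the definition, namely that, for each expansion order, the RF is the number of active flags among the three variables of that order over all tabulated BTD models, divided by the total number of such entries. The three rows of `btd_table` are hypothetical and do not reproduce any table of this paper.

```python
import numpy as np

def relevance_factors(btd_models):
    """Return one RF per expansion order (zeroth to fourth).

    btd_models: rows of 15 flags ordered as u_x1, u_y1, u_z1, ..., u_z5.
    """
    m = np.asarray(btd_models, dtype=float)
    per_order = m.reshape(len(m), 5, 3)   # (model, order, component)
    return per_order.mean(axis=(0, 2))    # active instances / total cases

# Hypothetical BTD rows (illustration only).
btd_table = [
    [1, 1, 1,  1, 1, 1,  1, 1, 1,  1, 1, 1,  1, 1, 1],  # full N = 4
    [1, 1, 1,  1, 1, 0,  0, 0, 0,  1, 1, 0,  0, 0, 0],  # seven DOF example
    [1, 1, 1,  1, 1, 0,  0, 0, 0,  0, 0, 0,  0, 0, 0],  # FSDT
]
rf = relevance_factors(btd_table)
```

With this reading, `rf[0]` equals 1 whenever the zeroth-order terms are active in every row, which is consistent with the meaning of $RF_0 = 1$ given in the text.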
Further analyses concerned the comparison of NN with linear regression (LR). LR is computationally cheaper than NN and can provide explicit weights related to each training feature. Figures 11 and 12 show the results for two training sets, namely, 10 and 100%. The accuracy of LR is acceptable only in the second case and remains lower than that of NN.

Table 17 BTD models, 0/90/0, a/h = 25 (columns: DOF, $u_{x1}$, $u_{y1}$, $u_{z1}$, $u_{x2}$, $u_{y2}$, $u_{z2}$, $u_{x3}$, $u_{y3}$, $u_{z3}$, $u_{x4}$, $u_{y4}$, $u_{z4}$, $u_{x5}$, $u_{y5}$, $u_{z5}$)
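As an illustration of how LR yields one explicit weight per training feature, the following sketch fits an ordinary least-squares model to the 12 active/inactive flags. The target is SYNTHETIC and exactly linear, so the fit recovers the weights; real error data need not behave this way. The function names and `w_true` are hypothetical.

```python
import itertools
import numpy as np

def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept; returns [intercept, weights...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return coef[0] + np.asarray(X, dtype=float) @ coef[1:]

# All 2^12 combinations of the 12 non-constant generalized variables.
X = np.array(list(itertools.product((0.0, 1.0), repeat=12)))

# SYNTHETIC, exactly linear target (illustration only): each inactive
# term adds a fixed penalty to the error.
w_true = np.linspace(1.0, 0.1, 12)
y = (1.0 - X) @ w_true

coef = fit_linear_regression(X, y)
weights = coef[1:]  # one explicit weight per flag; a negative weight lowers the error
```

Each entry of `weights` quantifies how much activating the corresponding variable changes the predicted error, which is the kind of explicit information a multilayer NN does not expose directly.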

0/90/0/90
The second numerical case investigated the effect of an asymmetric lamination on the BTD. All other parameters remained as those of the previous case. Table 10 presents the transverse displacement values with comparisons with other models from the literature, when available. Figures 13, 14 and 15 show the BTD from FE and NN, and Table 11 presents the numerical values of the models from the literature. For a/h = 50 and 10, the NN had three layers of ten neurons as one layer was not enough to fit the BTD curve that, in these cases, presents a more irregular shape than for a/h = 100. Tables 12, 13 and 14 show the BTD models and relevance factors. Figure 16 shows the BTD curve for a/h = 10 and the displacement field retrieved from Table 14. The results show that
• As mentioned, a more complex NN architecture was necessary, and the match between FE and NN BTD is not perfect. Some differences are visible for higher DOF models. However, such differences are still acceptable, given that, in the worst case, they remain within the 1% error range. The BTD curve presents various portions having different shapes, leading to a more difficult curve fitting.
• As in the previous case, a/h = 100 and 50 have similar BTD, and seven DOF are enough to have very low errors, with the FSDT providing accurate results.
• For a/h = 10, 11 DOF are necessary to have an error lower than 1%, with full fourth-order expansions for the in-plane terms. Besides linear terms, the third-order terms are decisive, as their absence leads to errors around 10%.
• The considered models from the literature provide high accuracy for thin shells. On the other hand, for a/h = 10, only the TSDT can provide acceptable accuracy.

a/h as a training variable
This section concerns the use of the thickness ratio as an additional training variable. The aim is to show the possibility of using NN to test structural theories and obtain results as the typical parameters of the structure change without the need of creating and running a new FE analysis. In this section, the training inputs are 13, i.e., 12 generalized displacement

Conclusions
This paper presented a new approach to evaluating the accuracy of shell models for composites via the use of neural networks (NN). The NN training used results from shell finite elements (FE) stemming from the Carrera Unified Formulation (CUF) and adopting a 15 DOF, fourth-order polynomial expansion along the thickness as the reference solution.
The first set of training inputs considered one-tenth of the combinations of active and inactive terms, i.e., keeping the constant terms always active, one-tenth of $2^{12}$ shell theories. The second set of inputs added the thickness ratio as a further variable. In all cases, the target was the maximum transverse displacement of a square, simply-supported shell under bi-sinusoidal transverse load. The NN architecture ranged from one to three layers, with ten neurons each. The result verification exploited the FE results of all $2^{12}$ cases. The NN provided the Best Theory Diagram (BTD), i.e., a curve giving the computationally cheapest model for a given accuracy. The BTD allows the evaluation of the accuracy of any structural model and provides guidelines on the relevance of each generalized displacement variable. The main findings of this paper are the following:
• The use of NN proved to be valid as it matched the FE solutions very well. The main convenience of NN is in the use of some 10% of FE analyses for training to obtain the BTD and evaluate the accuracy of a structural model without the need for further FE analyses.
• The NN training can incorporate physical features of the problem, such as the thickness ratio, making it possible to obtain results without the need for new FE analyses and preprocessing.
• Potential critical aspects of this approach emerged as the training considered thin and thick shells simultaneously. Such a scenario required the use of more hidden layers and needs further investigation.
• The BTD stemming from the NN matched very well those obtained via FE and presented in [76]. As is well known, the third-order in-plane terms are the most relevant variables to include to refine classical theories.
• Most of the models from the literature provide good accuracy, although increments in thickness or asymmetric laminations can make such models inaccurate.
The combined use of CUF and NN is promising given that the former can provide thousands of data sets in minutes and benchmarks for the rigorous assessment of results, whereas the latter can boost the computational efficiency and widen the applicability of virtual modeling. Future investigations should focus on the use of NN for multiple targets, e.g., multi-point stress values, and inverse problems to establish the best model by inputting the accuracy requirement.