Identification of nonlinear behavior with clustering techniques in car crash simulations for better model reduction
 Dennis Grunert^{1} and
 Jörg Fehr^{1}
DOI: 10.1186/s40323-016-0072-x
© The Author(s) 2016
Received: 19 February 2016
Accepted: 11 May 2016
Published: 13 June 2016
Abstract
Background:
Car crash simulations need a lot of computation time. Model reduction can be applied to save time. Because a crash is highly nonlinear, an automatic separation into parts that behave linearly and parts that behave nonlinearly is valuable for the subsequent model reduction.
Methods:
We analyze existing preprocessing and clustering methods like k-means and spectral clustering for their suitability in identifying nonlinear behavior. Based on these results, we improve existing algorithms and develop new ones that are especially suited for crash simulations. They are objectively rated with measures and compared with engineering experience. In future work, this analysis can be used to choose appropriate model reduction techniques for specific parts of a car. A cross-member of a 2001 Ford Taurus finite element model serves as an industrial-sized example.
Results:
Since a non-intrusive black-box approach is assumed, only heuristic approaches are possible. We show that our methods are superior to the existing ones in terms of simplicity, quality and speed. They also free the user from arbitrarily setting parameters in clustering algorithms.
Conclusion:
Though we improved existing methods by an order of magnitude, preparing them for use with a full car model, they remain heuristic approaches that need to be supervised by experienced engineers.
Keywords
Crash simulation, Nonlinear behavior, Black box identification, Spectral clustering, K-means, Model reduction
Background
In the modern design process of cars, crash tests are simulated with highly detailed finite element (FE) models on high-performance computing clusters. Spethmann et al. summarize in [1] that in 1998 a simulation was much more cost and time efficient than building a prototype, i.e., 5000 USD and weeks vs. 300,000 USD and half a year. There is reason to believe that this great difference still exists today but with an increased quality of the simulations. On the other hand, prototypes change rapidly nowadays. A major German car manufacturer has a database with around 7000 different prototypes for only one car type, which again increases the total computation time. Each of them needs to be checked for crash safety in an early development stage since major changes are impossible later on. On top of that, the design engineers need fast feedback about whether the current prototype is safe before they can continue with further modifications.
This raises the need for model (order) reduction (MOR) in the simulation process. Model reduction simplifies the underlying mathematical model in such a way that the dimension, and thus hopefully the simulation time, is reduced while introducing only an acceptable error. There exists a vast collection of model reduction methods, which can be separated into two groups: linear and nonlinear methods. They differ in whether the underlying differential equation is required to be linear or whether nonlinear terms are allowed. One would expect that only nonlinear reduction techniques should be applied to a car crash due to its highly nonlinear nature. We propose instead to automatically separate the car into one part with presumably linear and another part with presumably nonlinear behavior. It is then possible to apply linear MOR to the first part and nonlinear MOR to the second, cf. [2]. This way, linear methods, which usually have a better ratio between computation time saved and error introduced, can be used for most parts of the vehicle even though the crash in its entirety behaves nonlinearly. In addition, linear MOR techniques are more mature than nonlinear ones. Radermacher and Reese successfully used a similar approach by only reducing parts with elastic material behavior [3].
In this work, we analyze modern clustering techniques for their suitability in model reduction. The clustering is not used as a way of model reduction but as a preprocessing step to identify nonlinear behavior. As described above, the knowledge of this behavior will then be used to combine linear and nonlinear model reduction methods in future work that is not part of this article.
It can be seen that certain shortcomings exist with current methods to identify nonlinear behavior in car crashes. Therefore, the major contribution of this work is an improved clustering algorithm which addresses and solves these shortcomings. This is achieved by either improving existing methods or developing new ones. Considering that the majority of the simulations in automotive development concerns crash safety [1], we will focus on car crash tests in this article.
Setting
The finite element model of a 2001 Ford Taurus from the National Crash Analysis Center [6] serves as an example. The model depicted in Fig. 1 consists of almost one million (mostly shell) elements and was validated against actual hardware crash tests [7]. For simplicity, we will focus our analysis of nonlinear behavior on the cross-member shown in Fig. 2. It consists of only 3789 nodes and experiences large deformations as part of the crumple zone, which makes it a suitable candidate.
Overview of crash analysis
Model reduction in mechanics would usually substitute the equation of motion (1) by a differential equation of smaller dimension, the so-called reduced system. Unfortunately, this is not possible with the closed-source software LS-DYNA. The same is true for analyzing nonlinear behavior since there is no way to access the function \({\varvec{f}}_\text {int}\). Instead we can only use the simulation data resulting from \({\varvec{q}}(t)\) for different simulation runs, which is basically a black box approach. The import routine for the binary output of LS-DYNA to MATLAB was written at the institute.
Several ways to analyze the crash behavior have been published for different needs. Running the same simulation twice can result in different outputs due to round-off errors in parallel computing. In 2005 and 2008, Mei and Thole published an algorithm to detect areas with this scatter [9, 10], which was implemented in the software DiffCrash. Since the complete algorithm is not accessible to the authors, it will not be considered in this article. The Simdata-NL project [11, 12], on the other hand, used similar techniques to detect bifurcations and to group similarly behaving nodes. None of these methods was developed specifically to separate a model for subsequent model reduction, but they can be adapted to our needs. Only nodal positions were used in the aforementioned publications. Thus, we will also restrict ourselves to nodal positions in order to allow a fair comparison with our improvements.
Clustering
All aforementioned publications use either self-developed or known clustering methods such as k-means or spectral clustering. Therefore, it is vital to explain what clustering is. Clustering is the process of grouping a large data set by similarity into so-called clusters. It is a subclass of unsupervised learning since there do not exist any predefined categories. The area of application is huge: genome analysis, image segmentation, social networks and consumer analysis, to name only a few.
Despite its broad use, clustering is no Swiss army knife for data classification. In fact, [13] describes many pitfalls. First, there is no exact definition of what a cluster should look like; this lies in the eye of the beholder. Imagine, for example, all the books in a library as a data set. One could group them by genre, by age, by the second letter of the author's family name, by their position on the shelf, by their physical dimensions, etc. Another example is the famous ambiguous optical illusion “My Wife and My Mother-in-Law” by William Ely Hill: some see a young woman, and others see an old lady. The same is true for clustering algorithms that try to identify structure in data. Each algorithm has “its own view” of the data, which does not necessarily lead to the result the user of the algorithm expects. Additionally, most real-world data does not consist of natural clusters that merely need to be identified by a clustering algorithm. Instead, there are many possible ways to categorize data and all of them have their own right to exist. Thus, clusters are not identified but created by the algorithms. Even in randomly distributed data, clustering algorithms will return a grouping since they enforce structure on the data instead of recognizing natural clusters or deciding if there are any clusters at all. This can be a pitfall even for experienced users. Not only is there a large number of algorithms to choose from, but they usually have several parameters that need to be set and have only a heuristic meaning. Even the number of clusters usually needs to be specified in advance. In fact, no single clustering algorithm can satisfy three simple properties (scale invariance, richness and consistency) at once, as shown in [14].
For consistent notation in the further description of the algorithms, we assume that n d-dimensional data points \(\mathcal {X} = \{{\varvec{x}}_i\}_{i=1}^n \subseteq \mathbb {R}^d\) are given. A clustering algorithm is supposed to group these points into K disjoint clusters \(\mathcal {C} = \{C_k\}_{k=1}^K\). Mathematically speaking, \(\mathcal {C}\) is a partition of \(\mathcal {X}\), i.e., \(C_k \ne \emptyset \) for all k, \(\bigcup _k C_k = \mathcal {X}\) (particularly \(C_k \subseteq \mathcal {X}\) for all k) and \(C_k \cap C_{k'} = \emptyset \) for all \(1 \le k < k' \le K\).
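The three partition conditions can be checked mechanically. The following is a minimal sketch, assuming that data points are represented by their indices; the function name `is_partition` is hypothetical and not part of the article.

```python
from itertools import combinations


def is_partition(clusters, points):
    """Check the three partition conditions for a list of clusters.

    `points` holds the indices of all data points, `clusters` is a list of
    index collections (one per cluster).
    """
    clusters = [set(c) for c in clusters]
    points = set(points)
    # Condition 1: no cluster is empty.
    non_empty = all(len(c) > 0 for c in clusters)
    # Condition 2: the union of all clusters is the whole data set.
    covers = set().union(*clusters) == points
    # Condition 3: clusters are pairwise disjoint.
    disjoint = all(a.isdisjoint(b) for a, b in combinations(clusters, 2))
    return non_empty and covers and disjoint
```

For example, `is_partition([{0, 1}, {2}], range(3))` holds, whereas overlapping clusters such as `[{0, 1}, {1, 2}]` violate the third condition.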
K-means clustering
Strictly speaking, k-means refers to the optimization problem (3), which is NP-hard [16, 17]. Therefore, Lloyd’s algorithm described above (Algorithm 1) is used to approximate the exact solution, and it is what we mean by k-means in the remainder of this article. The computational complexity of Algorithm 1 is \(\mathcal {O}(n \cdot K \cdot d \cdot \omega )\) with \(\omega \) the number of iterations until satisfactory convergence is achieved in line 6. Even though \(\omega \) can grow exponentially in n [18], it is polynomial in n on average (via smoothed analysis) [19]. For real data, \(\omega \) is often observed to grow much more slowly and is considered proportional to n.
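To make the cost estimate concrete, here is a minimal NumPy sketch of Lloyd's algorithm. It is not the article's Algorithm 1 verbatim: the random initialization from the data points, the stopping rule (labels no longer change) and the names `lloyd_kmeans`, `max_iter` and `seed` are assumptions made for illustration.

```python
import numpy as np


def lloyd_kmeans(X, K, max_iter=100, seed=0):
    """Approximate k-means with Lloyd's alternating scheme.

    X has shape (n, d). Returns (labels, centers).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumed initialization: K distinct data points serve as first centers.
    centers = X[rng.choice(n, size=K, replace=False)].astype(float)
    labels = np.full(n, -1)
    for _ in range(max_iter):
        # Assignment step: squared distances of all points to all centers,
        # an (n, K) array computed in O(n * K * d).
        dist2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for k in range(K):
            members = X[labels == k]
            if len(members) > 0:
                centers[k] = members.mean(axis=0)
    return labels, centers


# Two well-separated pairs of points end up in two clusters.
X_demo = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels_demo, centers_demo = lloyd_kmeans(X_demo, K=2)
```

Each pass through the loop costs \(\mathcal {O}(n \cdot K \cdot d)\), so the number of passes plays the role of \(\omega \) in the complexity estimate.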
Spectral clustering
A more advanced clustering algorithm is spectral clustering, which dates back to 1973. There are several variants like [20] or [21] but we will only describe the so-called unnormalized form presented in [22].

The \(\epsilon \)-neighborhood graph weights edges only if the corresponding nodes \({\varvec{x}}_i\) and \({\varvec{x}}_j\) have a distance below \(\epsilon > 0\). Since all remaining edges have a similar distance (below \(\epsilon \)), they can be weighted with 1 instead of \(s_{ij}\). This results in the weights $$\begin{aligned} w_{ij} = {\left\{ \begin{array}{ll} 1 &{}\quad \text {if } \Vert {\varvec{x}}_i - {\varvec{x}}_j \Vert < \epsilon , \\ 0 &{}\quad \text {if } \Vert {\varvec{x}}_i - {\varvec{x}}_j \Vert \ge \epsilon . \end{array}\right. } \end{aligned}$$

In the l-nearest neighbor graph (\(l \in \mathbb {N}\)), edges are weighted only if one of the nodes is among the l nearest neighbors of the other node, i.e., $$\begin{aligned} w_{ij} = {\left\{ \begin{array}{ll} s_{ij} \quad \text {if } &{}\left| \{ 1 \le \hat{i} \le n: \Vert {\varvec{x}}_{\hat{i}} - {\varvec{x}}_j \Vert < \Vert {\varvec{x}}_i - {\varvec{x}}_j \Vert \} \right| < l \\ \quad \quad \text {or } &{}\left| \{ 1 \le \hat{j} \le n: \Vert {\varvec{x}}_i - {\varvec{x}}_{\hat{j}} \Vert < \Vert {\varvec{x}}_i - {\varvec{x}}_j \Vert \} \right| < l, \\ 0 &{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

The mutual l-nearest neighbor graph is a variant of the (non-mutual) l-nearest neighbor graph. The only difference is that each node needs to be among the l nearest neighbors of the other node, i.e., $$\begin{aligned} w_{ij} = {\left\{ \begin{array}{ll} s_{ij} \quad \text {if } &{}\left| \{ 1 \le \hat{i} \le n: \Vert {\varvec{x}}_{\hat{i}} - {\varvec{x}}_j \Vert < \Vert {\varvec{x}}_i - {\varvec{x}}_j \Vert \} \right| < l \\ \quad \quad \text {and } &{}\left| \{ 1 \le \hat{j} \le n: \Vert {\varvec{x}}_i - {\varvec{x}}_{\hat{j}} \Vert < \Vert {\varvec{x}}_i - {\varvec{x}}_j \Vert \} \right| < l, \\ 0 &{}\text {otherwise.} \end{array}\right. } \end{aligned}$$
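The three graph constructions can be sketched in a few lines of NumPy. The similarity \(s_{ij}\) is not defined in this part of the text, so a Gaussian similarity with a hypothetical width `sigma` is assumed here; the function names are likewise illustrative. The neighbor count follows the set definition literally, which means that a point counts as its own closest neighbor.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix of the rows of X, shape (n, n)."""
    return np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)


def gaussian_similarity(D, sigma=1.0):
    # ASSUMPTION: s_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)); the article's
    # own similarity function is not given in this excerpt.
    return np.exp(-D ** 2 / (2.0 * sigma ** 2))


def epsilon_graph(D, eps):
    """Weights 1 for distances strictly below eps, 0 otherwise."""
    W = (D < eps).astype(float)
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def closer_count(D):
    """count[i, j] = number of points strictly closer to x_j than x_i is.

    Builds an (n, n, n) boolean array, so this is for small n only.
    """
    return (D[:, None, :] < D[None, :, :]).sum(axis=0)


def nearest_neighbor_graph(D, S, l, mutual=False):
    """(Mutual) l-nearest neighbor graph with weights taken from S."""
    near = closer_count(D) < l  # near[i, j]: x_i is close to x_j
    mask = (near & near.T) if mutual else (near | near.T)
    W = np.where(mask, S, 0.0)
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


# Four points on a line; the last one is an outlier.
X_demo = np.array([[0.0], [1.0], [2.0], [10.0]])
D_demo = pairwise_distances(X_demo)
S_demo = gaussian_similarity(D_demo, sigma=5.0)
W_eps = epsilon_graph(D_demo, eps=1.5)
W_knn = nearest_neighbor_graph(D_demo, S_demo, l=2)
W_mut = nearest_neighbor_graph(D_demo, S_demo, l=2, mutual=True)
```

In this toy example the outlier is attached to its nearest point in the non-mutual graph but stays isolated in the mutual one, because that point has closer neighbors of its own.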

Choosing l around \(\log (n)\) in the l-nearest neighbor graph ensures connectivity in the limit \(n \rightarrow \infty \) for random data points \(\{{\varvec{x}}_i\}_{i=1}^n\).

For \(\epsilon \) in the \(\epsilon \)-neighborhood graph, the length of the longest edge in a minimal spanning tree of the (fully connected) graph is always a valid choice by definition. Recall that a minimal spanning tree connects all vertices with the smallest possible total edge weight. Since Prim’s algorithm [23] for calculating such a tree has complexity \(\mathcal {O}(n^2)\), this step can take a significant amount of time.
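The choice of \(\epsilon \) via a minimal spanning tree can be sketched as follows, using the dense \(\mathcal {O}(n^2)\) variant of Prim's algorithm on the Euclidean distance matrix. The helper names `longest_mst_edge` and `is_connected` are hypothetical. One detail is an observation of this sketch rather than of the article: the bound is attained exactly, so the connectivity check below uses a non-strict comparison, and with a strict inequality a marginally larger \(\epsilon \) would be needed.

```python
import numpy as np


def longest_mst_edge(X):
    """Length of the longest edge of a minimal spanning tree (Prim, O(n^2))."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = D[0].copy()  # cheapest known connection of each vertex to the tree
    longest = 0.0
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(candidates.argmin())  # next vertex to attach
        longest = max(longest, float(candidates[j]))
        in_tree[j] = True
        best = np.minimum(best, D[j])
    return longest


def is_connected(A):
    """Depth-first search on a boolean adjacency matrix A."""
    n = A.shape[0]
    seen = np.zeros(n, dtype=bool)
    seen[0] = True
    stack = [0]
    while stack:
        i = stack.pop()
        for j in np.flatnonzero(A[i] & ~seen):
            seen[j] = True
            stack.append(int(j))
    return bool(seen.all())


X_demo = np.array([[0.0], [1.0], [2.0], [10.0]])
D_demo = np.linalg.norm(X_demo[:, None, :] - X_demo[None, :, :], axis=2)
eps_demo = longest_mst_edge(X_demo)  # gap between the points 2 and 10
```

Any smaller \(\epsilon \) disconnects the outlier in this example, which is why the longest tree edge marks the connectivity threshold.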
The transformation of the data points \({\varvec{x}}_i\) to \(\varvec{y}_i\) allows the identification of non-convex clusters like rings. At first glance, it is not clear why this approach should work at all. Von Luxburg motivates it as the solution of a relaxed graph cut problem and finds analogies to a random walk as well as perturbation theory [22]. Though only some theorems can be proven rigorously, like the relation between the zero eigenvalues of \(\varvec{L}\) and the number of connected components in G, spectral clustering seems to be a valid clustering technique.
The eigenvalue decomposition dominates the computation time and renders this approach useless for very large data sets. A thin eigenvalue decomposition of a large matrix is usually computed with an iterative method like the Lanczos algorithm [24]. Its computational complexity depends on the number of iterations needed, which in turn depends on the gap between the eigenvalues of the matrix [25]. Though no computational complexity can be given for spectral clustering in general, it takes at least as long as k-means discussed in the “K-means clustering” section, since k-means is the last step in the algorithm, and should take much longer in practice due to the eigenvalue decomposition. Parallelization of the algorithm [26] and out-of-sample treatment via the Nyström method [27] on sparse grids [28] are newer approaches to reduce the computational complexity. The memory consumption can be another drawback since \(\varvec{W}\) is a sparse \(n \times n\) matrix only if the parameters \(\epsilon \) and l are chosen small enough in the creation of the similarity graph.
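The relation between zero eigenvalues and connected components can be illustrated numerically. This is a minimal sketch of the embedding step of unnormalized spectral clustering as presented in [22], assuming the graph Laplacian \(\varvec{L} = \varvec{D} - \varvec{W}\) with the diagonal degree matrix \(\varvec{D}\); it uses a dense eigenvalue decomposition, which is only feasible for small n, and the function name `spectral_embedding` is hypothetical.

```python
import numpy as np


def spectral_embedding(W, K):
    """Return all eigenvalues of L = D - W and the embedded points y_i.

    The rows of the returned (n, K) matrix are the points y_i, built from
    the eigenvectors belonging to the K smallest eigenvalues.
    """
    degrees = W.sum(axis=1)
    L = np.diag(degrees) - W
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return eigvals, eigvecs[:, :K]


# Toy graph: two triangles without any edge between them.
W_demo = np.zeros((6, 6))
W_demo[:3, :3] = 1.0
W_demo[3:, 3:] = 1.0
np.fill_diagonal(W_demo, 0.0)

eigvals_demo, Y_demo = spectral_embedding(W_demo, K=2)
n_zero = int((np.abs(eigvals_demo) < 1e-9).sum())  # number of components
```

In the embedded space, all nodes of one triangle collapse onto a single point, so a subsequent k-means run on the rows of `Y_demo` separates the two components trivially.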
Methods to predict the number K of clusters exist—both for general clustering algorithms and spectral clustering in particular. They are summarized in [22]. As discussed in the introduction of the “Clustering” section, the structure of the data which should be identified by a clustering algorithm depends on the expectation of the user. Thus, there cannot exist any general rule for how to choose K.
Preprocessing
In [11, 12], a method was developed to cluster (finite element) nodes with different moving patterns and intensity across several simulations with small variations in the thickness of the sheet metal. In the end, the clusters were used to analyze the presence of bifurcations. We will present their method in a slightly new setting, define some quality criteria, judge the current approach and improve it.
This method is applied to the frontal crash described in the “Setting” section. For a better understanding, we first consider a simplification shown in Fig. 3: A beam consisting of four nodes impacts a rigid wall, which is rotated by an angle of \(\alpha \) compared to an orthogonal contact. This simulation is run twice (\(R=2\)) with angles \(\alpha ^{(1)} = 20^\circ \) and \(\alpha ^{(2)} = 15^\circ \), and the displacements \(d_n^{(r)}\) are plotted for each simulation in Fig. 3c. We will call such a plot a displacement plot in the following. The red node has a higher displacement in the first simulation due to the larger angle. Hence, it is located below the diagonal \(\mathbb {R} \varvec{e}\) with \(\varvec{e} = (1,1)\) in the displacement plot.
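The construction of a displacement plot can be sketched as follows. The definition of \(d_n^{(r)}\) is not reproduced in this excerpt, so the sketch assumes that it is the Euclidean norm of the difference between the nodal positions at the end and at the beginning of the considered time interval. The positions are made-up numbers, not data from the beam example, and the function name is hypothetical.

```python
import numpy as np


def displacement_plot_points(p_start, p_end):
    """Collect one point x_n = (d_n^(1), ..., d_n^(R)) per node.

    p_start, p_end: nodal positions of shape (R, N, 3) for R simulation
    runs and N nodes. ASSUMPTION: d_n^(r) = ||p_end - p_start||.
    """
    d = np.linalg.norm(p_end - p_start, axis=2)  # shape (R, N)
    return d.T  # shape (N, R)


# Hypothetical positions for N = 4 nodes and R = 2 runs.
p_start = np.zeros((2, 4, 3))
p_end = np.array([
    [[3.0, 4.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[3.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
])
X_demo = displacement_plot_points(p_start, p_end)

# With the first run on the horizontal axis, a node lies below the
# diagonal exactly when its displacement is larger in the first run.
below_diagonal = X_demo[:, 0] > X_demo[:, 1]
```

Only the first node moves differently in the two runs here, so it is the only one that leaves the diagonal.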
Quality of clusters and nonlinear behavior
There needs to be a criterion to judge the quality of the clustering after the preprocessing described above. In [11], model reduction was applied to each cluster. Since no new simulations were done with a reduced system, this approach can be seen as data compression. Thus, the reconstruction error resulting from the projection on the subspace spanned by each cluster was a valid criterion to judge the quality in that case. This is, however, not true in our case. In this article, the clustering should be used for the model reduction of subsequent simulation runs with other parameters that were not part of the training data. Therefore, the real error resulting from the model reduction can only be judged after simulating the reduced model. Otherwise, there would not be any new data but only training data to assess the quality of the clustering and reduction. Error estimators or error bounds can also be useful, though they are not available for every reduction method.
So-called objective measures only rely on the data itself to rate the quality of a cluster, see [29, 30] for examples. Usually structural data like the cluster density or the separation between clusters is taken into account. While objective measures can be useful for assessing the quality of clusters in general, they are likely not equivalent to the expectations of the user since these expectations can vary a lot as discussed in the “Clustering” section. The other group of cluster indicators are subjective measures that take a specific user expectation into account. In the next section, we define several new subjective measures to assess the nonlinear behavior of a crash.

Nonlinear behavior in a crash can stem from several sources:

Geometry, e.g., large deformations, bifurcation or snap-through;

Physics (material laws), e.g., plasticity;

Boundary conditions, e.g., contact.
Measures
In order to quantify the nonlinear behavior, we define measures based on the deformation and scatter of the nodes. It has been seen that the diagonal of the displacement plot represented by the vector \(\varvec{e} := (1, \ldots , 1) \in \mathbb {R}^R\) plays a fundamental role in the analysis.
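The measure definitions themselves are not reproduced in this excerpt. The sketch below is therefore only one plausible way to relate a point of the displacement plot to the diagonal, not the article's definition: the component along \(\varvec{e}\) is taken as a deformation-like quantity and the orthogonal distance to the diagonal as a scatter-like quantity. The names `along_diagonal` and `off_diagonal` are hypothetical.

```python
import numpy as np


def diagonal_measures(X):
    """Split each row of X (shape (N, R)) relative to the diagonal R*e.

    Returns (along_diagonal, off_diagonal):
      along_diagonal - length of the orthogonal projection onto e,
      off_diagonal   - Euclidean distance from the diagonal.
    ASSUMPTION: these are illustrative stand-ins for deformation and
    scatter, not the measures defined in the article.
    """
    R = X.shape[1]
    e = np.ones(R)
    coeff = X @ e / R  # projection coefficient <x, e> / <e, e>
    along_diagonal = coeff * np.sqrt(R)
    residual = X - np.outer(coeff, e)
    off_diagonal = np.linalg.norm(residual, axis=1)
    return along_diagonal, off_diagonal


X_demo = np.array([[1.0, 1.0], [3.0, 1.0]])
along_demo, off_demo = diagonal_measures(X_demo)
```

A node that moves identically in all runs lies on the diagonal and gets an off-diagonal value of zero, whereas any run-to-run difference pushes it away from the diagonal.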
Analysis of the preprocessing
First of all, all entries \(d_n^{(r)}\) of every vector \({\varvec{x}}_n\) are above a lower bound of \(899~\text {mm}\). This can be explained by the rigid body movement of the car. Even though the first time step \(t_0\) was chosen to be the beginning of the crash, i.e., the car is in contact with the wall, the cross-member still has some distance to the wall, see Fig. 2 on the left. Therefore, the norm in Eq. (4) is dominated by the x-component, the driving direction of the car. The displacements in the other two directions have no significant influence on \(d_n^{(r)}\). Since this error is introduced at the beginning of the workflow, it persists in the clustering algorithm, which clusters the nodes by rigid body movement instead of (mechanical) deformation. Figure 5 also supports the hypothesis that the clustering of the nodes only depends on the x-coordinate: The nodes in the rear of the cross-member belong to the cluster with the highest deformation since their movement is decelerated the least.
Another problem lies in the distribution of the vectors \({\varvec{x}}\). They do not form any natural clusters in Fig. 4, hence clustering algorithms uncontrollably divide the nodes into three more or less connected sets. This questions the purpose of clustering as explained in the “Clustering” section and asks for alternatives.
In this example, we have arbitrarily chosen \(K=3\). There are a few techniques to guess the number of clusters beforehand, but only if there exist any (natural) clusters in the data which is not the case here. Choosing \(K=2\) and just assuming that one cluster represents linear and the other nonlinear behavior will not suffice. Additionally, the mapping of the clusters to the (non)linear area is unclear without any of the measures that are proposed in this article.
Improvements to all these disadvantages will be given in the following.
Improvements
After preliminary work of more theoretical nature and notation, it is now possible to describe and analyze some improvements developed by the authors.
Elimination of rigid body movement
By choosing \([t_0, t_1] = [100, 1000]\,{\text {ms}}\) as the smallest time interval covering the complete crash in the “Analysis of the preprocessing” section, we have already reduced the impact of the rigid body movement. Bohn et al., instead, looked at different time steps including the full simulation, cf. [11].
One could think that shifting the nodes in Fig. 4 to the origin would lead to the same result as subtracting the rigid body movement. This is not true since the above subtraction mostly affects the x-coordinate of \(\varvec{p}\), the major direction of the rigid body movement, which results in an equal weighting of all three spatial coordinates in (6). Therefore, not only did the nodes in the left of Fig. 6 move to the origin, but their formation also changed significantly. On top of that, a simple translation of the nodes would not have influenced the k-means clustering algorithm anyway since only the relative positions are important for k-means.
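The difference between shifting the displacement values and subtracting the rigid body movement can be seen in a small numerical sketch. It assumes that the rigid body movement is approximated by the motion of a single reference node, which is subtracted from every nodal displacement vector before the norm is taken; the numbers and the function name `displacement` are made up for illustration.

```python
import numpy as np


def displacement(p0, p1, ref=None):
    """Norm of the nodal displacement between two time steps.

    p0, p1: positions of shape (N, 3) at t_0 and t_1. If `ref` is given,
    the displacement of that reference node is subtracted first.
    ASSUMPTION: one reference node represents the rigid body movement.
    """
    u = p1 - p0
    if ref is not None:
        u = u - u[ref]
    return np.linalg.norm(u, axis=1)


# Hypothetical data in mm: every node travels about 900 mm in x; node 1 is
# additionally deflected by 30 mm in y, node 2 is compressed by 40 mm in x.
p0 = np.zeros((3, 3))
p1 = np.array([[900.0, 0.0, 0.0], [900.0, 30.0, 0.0], [860.0, 0.0, 0.0]])

d_raw = displacement(p0, p1)             # dominated by the x-movement
d_shifted = d_raw - d_raw.min()          # mere shift towards the origin
d_relative = displacement(p0, p1, ref=0) # rigid body movement removed
```

The 30 mm deflection in y changes the raw norm by only about 0.5 mm, and a shift of the raw values cannot recover it, whereas it appears in full once the reference motion has been subtracted.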
Though the subtraction of the rigid body movement with the help of a reference node already improves the clustering significantly, it can still only identify parts by their deformation behavior. But we are also interested in other sources of nonlinear behavior as described in the “Quality of clusters and nonlinear behavior” section.
Simple alternatives for clustering
With the measures defined in the “Measures” section, it is possible not only to judge existing methods but also to invent new approaches that use these measures for cluster optimization. This ensures that the clustering algorithms fit the measures that are used to assess the quality. We will present several new clustering techniques in the following.
The case \(M = S^\text {rel}\) depicted in Fig. 9 is more interesting since it is the first method that is able to identify areas with higher sensitivity between the simulation runs, which is another source of nonlinearity. This method allows us to identify the left front part of the cross-member as highly sensitive to small changes in the crash angle. The cluster with the highest sensitivity even has a relative scatter of \(61.5~\%\), as can be seen in the changed legend of Fig. 9.
Grouping
Although we have achieved a strong improvement over the original method described in the “Analysis of the preprocessing” section, it is still an open question which clusters belong to the presumably linearly and nonlinearly behaving areas and how large the number K of clusters should be chosen. We solve both problems at once by grouping the clusters.
Figure 11 shows the result of such a grouping. First, the cross-member is clustered with k-means like in Fig. 6 but this time into \(K=10\) clusters. The relative scatter of all 10 clusters is measured with \(M = S^\text {rel}_\text {mean}\). \(M_\text {min} = 3.3~\%\) is the lowest and \(M_\text {max} = 66.1~\%\) is the highest relative scatter out of the 10 values. A parameter \(\delta = 0.7\) leads to a threshold of \(\tau = 47.3~\%\) according to (7). Thus, the clusters with the two highest relative scatter values \(S^\text {rel}_\text {mean}\) of \(48.8~\%\) and \(66.06~\%\) are grouped to \(G_\text {nonlin}\) since their measure is above \(\tau \), and the remaining eight clusters are merged into \(G_\text {lin}\). The result can be examined in Fig. 11. It should be mentioned that the grouping method found almost the same area of high scatter as equal clustering with respect to \(S^\text {rel}\), see Fig. 9, even though a different clustering method was used.
Choosing an appropriate \(\delta \) and judging the resulting two groups is still a job that can only be done by an experienced engineer, but the measures from the “Measures” section help a lot to automate this part of the workflow. The engineer can first look at the measures of each cluster and get an idea of their range. They can then, for example, decide that most of the clusters have only a low measure, which is a sign of linear behavior, and choose a high \(\delta \) like 0.9. Another option would be to consider every cluster with a relative scatter higher than \(10~\%\) as behaving nonlinearly. Choosing \(\delta \) by solving (7) for \(\tau = 10~\%\) will lead to that grouping. Dividing the nodes into \(G_\text {lin}\) and \(G_\text {nonlin}\) would be impossible without the measures from the “Measures” section since the clusters from any clustering algorithm do not have any meaning. They are not sorted or labeled in any way. Only the measures from the “Measures” section give them a meaning like “area with high deformation”.
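The grouping step can be sketched in a few lines. Equation (7) is not reproduced in this excerpt; the sketch assumes the linear interpolation \(\tau = M_\text {min} + \delta \, (M_\text {max} - M_\text {min})\), which reproduces the reported threshold of about \(47.3~\%\) for \(M_\text {min} = 3.3~\%\), \(M_\text {max} = 66.1~\%\) and \(\delta = 0.7\). The function names are hypothetical, and all cluster measures in the example other than the minimum and the two largest values are made up.

```python
def threshold(m_min, m_max, delta):
    """ASSUMED form of Eq. (7): tau = M_min + delta * (M_max - M_min)."""
    return m_min + delta * (m_max - m_min)


def delta_for_threshold(m_min, m_max, tau):
    """Solve the assumed Eq. (7) for delta at a prescribed threshold tau."""
    return (tau - m_min) / (m_max - m_min)


def group_clusters(measures, delta):
    """Split cluster indices into (tau, G_lin, G_nonlin) by their measure."""
    tau = threshold(min(measures), max(measures), delta)
    g_nonlin = [k for k, m in enumerate(measures) if m > tau]
    g_lin = [k for k, m in enumerate(measures) if m <= tau]
    return tau, g_lin, g_nonlin


# Relative scatter in percent for K = 10 clusters. Only 3.3, 48.8 and 66.1
# are taken from the text; the remaining values are hypothetical.
measures_demo = [3.3, 4.1, 5.0, 6.2, 7.5, 9.0, 12.4, 20.3, 48.8, 66.1]
tau_demo, g_lin_demo, g_nonlin_demo = group_clusters(measures_demo, delta=0.7)

# delta that corresponds to a fixed threshold of 10 %
delta_10 = delta_for_threshold(3.3, 66.1, 10.0)
```

With these numbers, the two clusters with the highest scatter form \(G_\text {nonlin}\), and a fixed threshold of \(10~\%\) corresponds to \(\delta \approx 0.107\) under the assumed formula.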
Summary and outlook
It was observed that black box identification of nonlinear behavior can only be solved by heuristics. First, measures were defined to quantify the deformation and scatter, which correlate to nonlinear behavior. With these measures it was possible to judge improvements of existing methods. The subtraction of the rigid body movement is necessary in the preprocessing. New methods like equal clustering, which clusters with respect to a measure, are more robust and faster. Additionally, they can guarantee that the clustering fits to the measures that a user wants to optimize without the need to choose any parameters like in spectral clustering. An optional grouping eliminates the need to estimate the number of clusters. Overall, they all proved to be a valuable contribution in raising the quality of the identification and matched with the experience of engineers.
The algorithms are almost ready to be applied to a full car model. The guaranteed linear time scaling of equal clustering allows the direct application on a full car model. But this can result in clusters that are scattered all over the model. For a later model reduction, it is necessary that the size of the interface between linearly and nonlinearly reduced areas is as small as possible, see [2]. Thus, the geometry of the car needs to be taken into account during the clustering, e.g., all clusters should be connected sets.
The resulting clustering can now be used to apply certain model reduction methods to the clusters in future work. Parts of the model that are considered to behave nonlinearly should be reduced with nonlinear model reduction techniques, the rest with linear methods. This approach is deemed to be advantageous over using only one reduction method. If a crash simulation is only reduced linearly, the error would be higher since nonlinear behavior cannot be reproduced. If nonlinear model reduction is used for the full car model, it would lead to higher computation times due to the usually more complex nonlinear algorithms. This needs to be proven with experiments in future work.
In this paper it was assumed that only the nodal positions are available. If the user has access to more output data like strain or stress, the aforementioned methods can be extended to this new data. It should also be beneficial to combine the results based on deformation and scatter in order to take both sources of nonlinearity into account.
Notes
Declarations
Authors' contributions
DG analyzed and improved the existing methods. JF had the idea to separate linearly from nonlinearly behaving parts and implemented most of the import functions from LS-DYNA to MATLAB. Both authors read and approved the final manuscript.
Acknowledgements
The authors would like to thank Anne-Kathrin Zeile for preliminary work. Additionally, they would like to thank the German Research Foundation (DFG) for financial support of the project within the Cluster of Excellence in Simulation Technology (EXC 310/1) at the University of Stuttgart. The 2001 Ford Taurus model has been developed by The National Crash Analysis Center (NCAC) of The George Washington University under a contract with the FHWA and NHTSA of the US DOT.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
1. Spethmann P, Thomke SH, Herstatt C. The impact of simulation on productivity and problem-solving in automotive R&D. Technical report, Technische Universität Hamburg-Harburg; 2006.
2. Fehr J, Holzwarth P, Eberhard P. Interface and model reduction for efficient simulations of nonlinear vehicle crash models. Accepted by Mathematical and Computer Modelling of Dynamical Systems.
3. Radermacher A, Reese S. Model reduction in elastoplasticity: proper orthogonal decomposition combined with adaptive substructuring. Comput Mech. 2014;54(3):677–87. doi:10.1007/s00466-014-1020-6.
4. LS-DYNA Product Home Page. http://www.lstc.com/products/lsdyna/. Accessed 2 Feb 2016.
5. Livermore Software Technology Corporation. LS-DYNA theory manual. Livermore: Livermore Software Technology Corporation; 2006.
6. National Crash Analysis Center. http://www.ncac.gwu.edu/ncac/. Accessed 2 Feb 2016.
7. Marzougui D, Samaha RR, Cui C, Kan CDS. Extended validation of the finite element model for the 2001 Ford Taurus passenger sedan. In: Technical report NCAC 2012W004, The National Crash Analysis Center, The George Washington University, 45085 University Drive, Ashburn; 2012.
8. Craig R. Coupling of substructures for dynamic analyses: an overview. In: Proceedings of the AIAA dynamics specialists conference, Paper-ID 2000-1573. Atlanta; 2000.
9. Mei L, Thole CA. Clustering algorithms for parallel car-crash simulation analysis. In: Bock H, Phu H, Kostina E, Rannacher R, editors. Modeling, simulation and optimization of complex processes. Berlin: Springer; 2005. p. 331–40.
10. Mei L, Thole CA. Data analysis for parallel car-crash simulation results and model optimization. Simul Model Pract Theory. 2008;16(3):329–37.
11. Bohn B, Garcke J, Iza-Teran R, Paprotny A, Peherstorfer B, Schepsmeier U, Thole CA. Analysis of car crash simulation data with nonlinear machine learning methods. Procedia Comp Sci. 2013;18:621–30.
12. Griebel M, Bungartz HJ, Czado C, Garcke J, Trottenberg U, Thole CA, Bohn B, Iza-Teran R, Paprotny A, Peherstorfer B, Schepsmeier U. SIMDATA-NL – Nichtlineare Charakterisierung und Analyse von FEM-Simulationsergebnissen für Autobauteile und Crash-Tests. Final report (in German), Bundesministerium für Bildung und Forschung. 2014. http://publica.fraunhofer.de/dokumente/N326423.html. Accessed 16 Feb 2015.
13. Jain AK. Data clustering: 50 years beyond k-means. Pattern Recog Lett. 2010;31(8):651–66.
14. Kleinberg JM. An impossibility theorem for clustering. In: Becker S, Thrun S, Obermayer K, editors. Advances in neural information processing systems, vol. 15. Cambridge: MIT Press; 2003. p. 463–70.
15. Jain AK, Dubes RC. Algorithms for clustering data. Upper Saddle River: Prentice-Hall Inc; 1988.
16. Aloise D, Deshpande A, Hansen P, Popat P. NP-hardness of Euclidean sum-of-squares clustering. Mach Learn. 2009;75(2):245–8. doi:10.1007/s10994-009-5103-0.
17. Dasgupta S, Freund Y. Random projection trees for vector quantization. IEEE Transact Inform Theory. 2009;55(7):3229–42. doi:10.1109/TIT.2009.2021326.
18. Vattani A. k-means requires exponentially many iterations even in the plane. Discrete Comput Geom. 2011;45(4):596–616. doi:10.1007/s00454-011-9340-1.
19. Arthur D, Manthey B, Röglin H. Smoothed analysis of the k-means method. J ACM. 2011;58(5):1–31. doi:10.1145/2027216.2027217.
20. Ng AY, Jordan MI, Weiss Y. On spectral clustering: analysis and an algorithm. In: Advances in neural information processing systems, vol 14. MIT Press: Cambridge; 2001. p. 849–56.
21. Shi J, Malik J. Normalized cuts and image segmentation. IEEE Transact Pattern Anal Mach Intell. 2000;22(8):888–905.
22. von Luxburg U. A tutorial on spectral clustering. Stat Comput. 2007;17(4):395–416. doi:10.1007/s11222-007-9033-z.
23. Prim RC. Shortest connection networks and some generalizations. Bell Syst Tech J. 1957;36(6):1389–401. doi:10.1002/j.1538-7305.1957.tb01515.x.
24. Lanczos C. An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. J Res Natl Bur Stand. 1950;45(4):255–82.
25. Kuczyński J, Woźniakowski H. Estimating the largest eigenvalue by the power and Lanczos algorithms with a random start. SIAM J Matrix Anal Appl. 1992;13(4):1094–122. doi:10.1137/0613066.
26. Zheng J, Chen W, Chen Y, Zhang Y, Zhao Y, Zheng W. Parallelization of spectral clustering algorithm on multi-core processors and GPGPU. In: Computer systems architecture conference, 2008. ACSAC 2008. 13th Asia-Pacific. 2008. p. 1–8. doi:10.1109/APCSAC.2008.4625449.
27. Bengio Y, Paiement JF, Vincent P, Delalleau O, Roux NL, Ouimet M. Out-of-sample extensions for LLE, Isomap, MDS, eigenmaps, and spectral clustering. In: Thrun S, Saul LK, Schölkopf B, editors. Advances in neural information processing systems, vol. 16. Cambridge: MIT Press; 2004. p. 177–84.
28. Peherstorfer B, Pflüger D, Bungartz HJ. A sparse-grid-based out-of-sample extension for dimensionality reduction and clustering with Laplacian eigenmaps. In: Wang D, Reynolds M, editors. Advances in artificial intelligence 2011, vol. 7106. Lecture notes in computer science. Berlin: Springer; 2011. p. 112–21.
29. Rousseeuw PJ. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math. 1987;20:53–65. doi:10.1016/0377-0427(87)90125-7.
30. Stein B, Meyer zu Eissen S, Wißbrock F. On cluster validity and the information need of users. In: Hanza MH, editor. 3rd international conference on artificial intelligence and applications (AIA 03). Anaheim, Calgary, Zurich: ACTA Press; 2003. p. 216–21.
31. Wriggers P. Nonlinear finite element methods. Berlin: Springer; 2010.