Estimation of error in observables of coarse-grained models of atomic systems
 John Tinsley Oden^{1},
 Kathryn Farrell^{1} and
 Danial Faghihi^{1}
https://doi.org/10.1186/s40323-015-0025-9
© Oden et al.; licensee Springer. 2015
Received: 24 November 2014
Accepted: 12 February 2015
Published: 3 May 2015
Abstract
Background
The use of coarse-grained approximations of atomic systems is the most common method of constructing reduced-order models in computational science. However, the issue of central importance in developing these models is the accuracy with which they approximate key features of the atomistic system. Many methods have been proposed to calibrate coarse-grained models so that they qualitatively mimic the atomic systems, but these are often based on heuristic arguments.
Methods
A general framework for deriving a posteriori estimates of modeling error in coarse-grained models of key observables in atomistic systems is presented. Such estimates provide a new tool for model validation analysis. The connection of error estimates with the relative information entropy of observables and model predictions is explained for so-called misspecified models. The relationship between model plausibilities and the Kullback-Leibler divergence between the truth and model predictions is summarized in several theorems.
Results
Numerical examples are presented in this paper involving a family of coarse-grained models of a polyethylene chain of united atom monomers. Numerical results suggest that the proposed methods of error estimation can be very good indications of the error inherent in coarse-grained models of observables in the atomistic systems. Also, new theorems relating the Kullback-Leibler divergence between model predictions and observations to measures of model plausibility are presented.
Conclusions
A formal structure for estimating errors produced by coarse-graining atomistic models is presented. Numerical examples confirm that the estimates are in agreement with exact errors for a simple class of materials. Errors measured in the D_{KL} divergence can be related to computable model plausibilities. The results should provide a powerful framework for assessing the validity and accuracy of coarse-grained models.
Keywords
Background
Coarse-grained reduced-order models
The most common method of constructing reduced-order models in all of computational science involves the use of coarse-grained models of atomic systems, whereby systems of atoms are aggregated into “beads”, “super atoms”, or molecules to reduce the number of degrees of freedom and to lengthen the time scales over which the evolution of events is simulated.
The use of coarse-grained (CG) approximations has been prevalent in molecular dynamics (MD) simulations for many decades. Comprehensive reviews of a large segment of the literature on CG models were recently given by Noid [1] and Li et al. [2], and an application to semiconductor nanomanufacturing is discussed in Farrell et al. [3]. The issue of central importance in developing CG models is the accuracy with which they approximate key features of the atomistic system. Many methods have been proposed to calibrate CG models so that they qualitatively mimic the all-atom (AA) systems, but these are often based on heuristic arguments.
In this paper, we develop a posteriori estimates of error in CG approximations of observables in the AA system. We focus on standard molecular dynamics models of microcanonical-ensemble (NVE) thermodynamics, and we call upon the theory of model adaptivity and error estimation laid down in [4] and [5]. In this particular setting, new estimates are also obtained when the information entropy of Shannon [6] is used as a quantity of interest. This leads to methods for estimating CG-model parameters that involve the Kullback-Leibler divergence between probability densities of observables in the AA and CG systems.
In the final Results and discussion section of this presentation, we review several statistical properties of parametric models, including asymptotic properties of misspecified models and generalizations of the Bernstein-von Mises theorem advanced by Kleijn and van der Vaart [7]. There, the fundamental role of the Kullback-Leibler distance (the D_{KL}) between the true probability distribution and the observations accessible by the model is reviewed. We present results in the form of theorems that relate the D_{KL} to measures of model plausibility that arise from Bayesian approaches to model selection. The relationships of the a posteriori estimates to the statistical interpretations are summarized in the concluding remarks.
Preliminaries, conventions and notations
Here covalent bonds are represented by the harmonic potential (1a), changes in bond angles by (1b), torsional potentials by changes in dihedral angles (1c), Lennard-Jones nonbonded potentials by (1d), with r_{ij}=r_{i}−r_{j}, r_{c} the cutoff radius, and (α,β) typically (12,6), and Coulomb potentials between charges q_{i} at r_{i} and q_{j} at r_{j} by (1e). These forms are typical of those implemented in popular MD codes, although several other common potentials could be added. The parameters of the potential model are given by the vector of physical coefficients: {k_{i}, k_{θi}, V_{ji}, ϕ_{i}, ε_{ij}, σ_{ij}, r_{c}, …}. In general, atomic properties and values of parameters for the full all-atom system are supplied by systems calibrated using experimental data or quantum mechanics predictions (see, e.g. the OPLS data in [8,9]).
where repeated indices are summed throughout their range, m_{αβ}=m_{(α)}δ_{αβ} is the mass of atom α, superimposed dots indicate time derivatives, r_{βi} is the component of r_{β} in direction i, ∂_{αi}=∂/∂r_{αi}, u(r^{n}(t)) is the total interatomic potential of the system given, e.g., by (2), and f_{αi}(t) is the ith component of the applied force on atom α at time t. We will add initial conditions, \(\dot {r}_{\beta i}(0) = v_{\beta i}\) and \(r_{\beta i}(0)=r^{0}_{\beta i}\), where v_{βi} and \(r^{0}_{\beta i}\), for now, are assumed to be given.
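As a concrete illustration of how the bonded and nonbonded terms (1a) and (1d) are evaluated in practice, the following sketch implements a harmonic bond potential and a truncated Lennard-Jones potential in Python. All numerical parameter values are hypothetical placeholders, not the OPLS values cited above, and the 1/2 factor in the harmonic term is one of several conventions in use.

```python
# Illustrative parameter values (hypothetical, not the OPLS parameters cited in the text).
K_BOND = 300.0        # harmonic bond stiffness k_i
R0 = 1.53             # equilibrium bond length
EPS, SIG = 0.25, 3.4  # Lennard-Jones epsilon_ij, sigma_ij
R_CUT = 10.0          # cutoff radius r_c

def harmonic_bond(r):
    """Harmonic bond potential as in (1a): (1/2) k_i (r - r0)^2 (convention-dependent prefactor)."""
    return 0.5 * K_BOND * (r - R0) ** 2

def lennard_jones(r, alpha=12, beta=6):
    """Truncated Lennard-Jones potential as in (1d), with exponents (alpha, beta) = (12, 6)
    and zero beyond the cutoff radius r_c."""
    if r >= R_CUT:
        return 0.0
    sr = SIG / r
    return 4.0 * EPS * (sr ** alpha - sr ** beta)
```

The total potential u(r^n) of (2) is then a sum of such terms over all bonds, angles, dihedrals, and nonbonded pairs.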
Molecular dynamical equations of the form (5) are typical of those in standard MD codes, numerically integrated with randomly sampled initial conditions over time intervals to approximate systems with constant energy, fixed volume, and a fixed number of particles, corresponding to so-called microcanonical ensembles. Without loss of generality, we confine this development to such thermodynamic scenarios, noting that straightforward extensions to, say, constant-temperature settings are covered by replacing (5) with appropriate “thermostat” models, such as the Langevin or Nosé–Hoover formulations (see e.g. [10]). The general approach is then applicable to canonical ensembles and more general statistical thermodynamics settings.
the notation, in analogy with (1), being chosen to indicate parameters of the CG model.
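Equations of motion of this type are commonly integrated with a symplectic scheme such as velocity Verlet, whose near-conservation of total energy is what makes microcanonical (NVE) simulation possible. The sketch below is a generic illustration of that standard algorithm on a one-dimensional harmonic oscillator, not the specific code used in this paper (the numerical example later in the paper uses a Runge-Kutta integrator).

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Integrate m x'' = f(x) with the velocity Verlet scheme common in NVE MD codes."""
    f = force(x)
    traj = [(x, v)]
    for _ in range(steps):
        x = x + v * dt + 0.5 * (f / mass) * dt * dt   # position update
        f_new = force(x)
        v = v + 0.5 * (f + f_new) / mass * dt          # velocity update with averaged force
        f = f_new
        traj.append((x, v))
    return traj

# Harmonic oscillator with m = 1, k = 1: total energy should stay (nearly) constant,
# mimicking the constant-energy character of a microcanonical ensemble.
traj = velocity_verlet(1.0, 0.0, lambda x: -x, 1.0, 0.01, 1000)
energy = [0.5 * v * v + 0.5 * x * x for x, v in traj]
```

The energy drift over the run is bounded and small, in contrast to non-symplectic schemes where it grows secularly.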
It is important to establish a kinematic and algebraic relation between the coordinates of particles in the AA system and those in the CG system. A very large literature exists on various coarse-graining mapping schemes, and choices of the appropriate map from the AA to the CG system, or vice versa, are often based on heuristic methods (see e.g. [2]). Our general approach can be adapted to any such well-defined AA-to-CG or CG-to-AA map, but for definiteness, we describe one such family of mappings.
Here we assume that each AA coordinate vector r _{ α } belongs to only one bead identified with CG vector R _{ A }, but this is not a necessary assumption. In the case of bonded systems in which r _{ α } is associated with, say, two index sets \(\mathcal {J}_{A}\) and \(\mathcal {J}_{B}\), we simply choose either \(\mathcal {J}_{A}\) or \(\mathcal {J}_{B}\) as the representative of r _{ α } and associate r _{ α } with only one bead to avoid double counting.
where G is the AAtoCG map defined in (11), where we denote r ^{ n }=G(R ^{ N }(t),θ), and where we specifically present the dependence of the QoI on the CG potential parameters θ.
with \(f_{\alpha i} = \omega ^{\beta }_{\alpha } G^{\cdot A}_{\beta } F_{Ai}\), and with repeated indices summed, 1≤α,β≤n,1≤A≤N.
where \(G^{\cdot \alpha }_{A}\) is the transpose of \(G^{\cdot A}_{\alpha }\) (no sum on α).
to define the image of this sample in the CG system.
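One common concrete choice of the AA-to-CG map described above is the mass-weighted (center-of-mass) mapping over disjoint index sets \(\mathcal{J}_A\). The sketch below (function and variable names are our own, purely illustrative) implements that choice; the weights play the role of the \(\omega^{\beta}_{\alpha}\) appearing in the force relation.

```python
import numpy as np

def coarse_grain(r, masses, bead_index_sets):
    """Map AA coordinates r (n x 3) onto CG bead coordinates R (N x 3) using a
    mass-weighted (center-of-mass) mapping, one common choice of the map G.
    Each index set J_A lists the atoms assigned to bead A; the sets are disjoint,
    so no atom is double counted."""
    R = np.zeros((len(bead_index_sets), r.shape[1]))
    for A, J_A in enumerate(bead_index_sets):
        m = masses[J_A]
        # weights omega_alpha = m_alpha / M_A, summing to one within each bead
        R[A] = (m[:, None] * r[J_A]).sum(axis=0) / m.sum()
    return R

# four united atoms lumped into two beads of two atoms each
r = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [4.0, 0.0, 0.0]])
masses = np.ones(4)
beads = [np.array([0, 1]), np.array([2, 3])]
R = coarse_grain(r, masses, beads)
```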
Methods
Weak forms of the dynamical problem
The notation \(\mathcal {B(\cdot ; \cdot)}\) is intended to mean that \(\mathcal {B(\cdot ; \cdot)}\) is possibly nonlinear in entries to the left of the semicolon and linear in the entries to the right of it.
is equivalent to (8) in the sense that every solution of (8), with appropriate initial conditions, satisfies (22), and any sufficiently smooth solution of (22) satisfies (8).
The adjoint problem
where Q is a functional on \(\mathcal{V}\), and both \(\mathcal {B}'(\cdot ; \cdot)\) and Q ^{′}(·;·) are assumed to exist and be finite (i.e. \(\mathcal {B}(\cdot ; \cdot)\) and Q(·) are Gâteaux differentiable). Then the adjoint or dual problem associated with (22) is
Theory of a posteriori estimation of modeling error
where \(\mathcal {B(\cdot ; \cdot)}\) is a semilinear form from \(\mathcal {V} \times \mathcal {V}\) into \(\mathbb{R}\) and \(\mathcal{F}\) is a linear functional on \(\mathcal{V}\). Problem (30) is equivalent to the problem of finding a solution u of A(u)=F in the dual space \(\mathcal {V}'\), where A is the map induced by \(\mathcal {B(\cdot ; \cdot)}\): \( \langle A(u),v \rangle = \mathcal {B} (u ; v) = \mathcal {F}(v) = \langle \mathcal {F}, v \rangle \), \(\langle \cdot, \cdot \rangle\) denoting the duality pairing on \(\mathcal {V}' \times \mathcal {V}\). Assuming (30) is solvable for u, we wish to compute the value Q(u) of a functional \(Q: \mathcal {V} \rightarrow \mathbb {R}\) representing a quantity of interest, or an observable of interest.
with similar definitions of higher-order derivatives, e.g. \(\mathcal {B}^{\prime \prime }(u; w_{1}, w_{2},v)\), \(\mathcal {B}^{\prime \prime \prime }(u; w_{1}, w_{2}, w_{3}, v)\), Q ^{′′}(u;w,v), Q ^{′′′}(u;v _{1},v _{2},v _{3}), etc. See [5] for details.
which, for each \(u_{0} \in \mathcal {V}\), is a linear functional on \(\mathcal{V}\).
Obviously, if u _{0}=u, the solution of (30), \(\mathcal {R}(u;v)=0 \;\; \forall v \in \mathcal {V}\). Thus, \(\mathcal {R}(u_{0};v)\) describes the degree to which the vector u _{0} fails to satisfy the central problem (30).
We now recall the basic theorem in [5]:
Theorem 1.
where Δ is a remainder involving higher-order terms in e _{0}=u−u _{0} and ε _{0}=z−z _{0}, z _{0} being an approximation of z.
An explicit form of Δ is given in the appendix.
This relation is the basis for many successful methods of a posteriori error estimation of both modeling error and numerical error. Whenever \(\mathcal {B}(\cdot ;\cdot)\) is a bilinear form and Q(·) is linear, Δ≡0.
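In the linear case, where Δ≡0, the mechanics of this estimate can be seen in a few lines of linear algebra: for A u = f with linear quantity of interest Q(u) = q·u, the adjoint solution z of Aᵀz = q pairs with the residual to reproduce the error Q(u)−Q(u₀) exactly. The following sketch, with a randomly generated well-conditioned system (purely illustrative, not an example from the paper), demonstrates this identity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned "forward" operator
f = rng.standard_normal(n)                        # right-hand side F
q = rng.standard_normal(n)                        # linear QoI: Q(u) = q . u

u = np.linalg.solve(A, f)                 # exact solution of the base problem
u0 = u + 0.01 * rng.standard_normal(n)    # a perturbed "surrogate" solution
z = np.linalg.solve(A.T, q)               # adjoint (dual) solution: A^T z = q

exact_error = q @ u - q @ u0              # Q(u) - Q(u0)
estimate = z @ (f - A @ u0)               # residual paired with the adjoint: R(u0; z)
```

Because both the form and the QoI are linear here, the estimate agrees with the exact error up to floating-point roundoff; in the nonlinear setting the discrepancy is precisely the remainder Δ.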
A posteriori estimation of error in CG approximations
The CG approximations of the “ground truth” AA system are characterized by a parametric class \(\mathcal {P}(\boldsymbol {\theta })\) of molecular dynamics models, one model corresponding to each choice of the vector θ in a space Θ of parameters defining the CG intermolecular potential U(R ^{ N }(t),θ). For a given value of θ, observables of interest in states of thermodynamic equilibrium of the CG system are typically generated as averages of samples of the observables taken over subintervals [t _{ k },t _{ k+1}]⊂[0,τ], for a distribution of initial conditions (see, e.g. [10]).
\(\|\cdot\|_{\mathcal {V}'}\) being the norm on the dual space \(\mathcal {V}'\). The problem of error estimation thus reduces to one of developing efficient procedures to compute the residual ρ and to compute reasonable approximations of z ^{ n }.
It is clear that a quantitative estimate such as (36) (or an approximation with z ^{ n } replaced by Z ^{ n }) could be a powerful tool for determining the validity of the CG model or for designing validation experiments for CG models. In theory, it also provides a basis for selecting optimal parameters for a given model so as to manage ε(θ). We elaborate on this notion in the final part of the Results and discussion section.
Results and discussion
Numerical example: estimation of error in CG approximation of a polyethylene chain
with the mass of each bead set to M=P m and the bond stiffness K=α k/P; \(\alpha \in \mathbb {R}^{+}\).
where U _{CG}(t) is the projection Π U(t) onto AA atom locations.
Δ being the remainder in (34). Since the forms in (43)–(46) are linear in their respective arguments, the exact remainder Δ should be zero, but the error introduced by the numerical integration schemes employed generally leads to an additional numerical error Δ _{ Δ t }≠0. We employ a conventional Runge-Kutta algorithm here to integrate (39), (41), and (47).
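The CG parameters introduced above (bead mass M = Pm and bond stiffness K = αk/P) can be sketched directly. The 1/P factor follows the series-spring heuristic, P bonds of stiffness k in series behaving like a single bond of stiffness k/P, with α a calibration parameter as in the text; the helper function and its name are our own.

```python
def cg_bead_parameters(m, k, P, alpha=1.0):
    """CG parameters for a bead lumping P united atoms of mass m joined by bonds
    of stiffness k: mass is additive (M = P m), and P bonds in series act like a
    single bond of stiffness k / P, rescaled by a calibration factor alpha."""
    M = P * m            # bead mass
    K = alpha * k / P    # effective CG bond stiffness
    return M, K
```

For example, lumping P = 4 united atoms of mass 14 with bond stiffness 100 (arbitrary units) gives a bead of mass 56 and, for α = 1, a CG stiffness of 25.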
In general, the solution of the base model is not available. To demonstrate the effectiveness of the method presented here, however, the equations of motion for the united atom system are also solved in this example. Given the solution u(t) of the united atom model, the evolution of the exact error ζ and the estimate η _{ t } over time is shown in Figure 2d.
Maximum entropy principle for atomic systems
where H(p,q) (\(= -\int _{\mathbb {R}} p \log q \,\text {d}y\)) is the cross entropy and it is understood that \(0 \log \frac {0}{0}=0\) and \(0 \log \frac {0}{q}=0\).
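A minimal discrete implementation of the D_{KL}, using the 0 log(0/q) = 0 convention stated above, makes these definitions concrete (a sketch for probability vectors, not the continuous phase-space densities of the text):

```python
import math

def kl_divergence(p, q):
    """Discrete Kullback-Leibler divergence D_KL(p || q) = sum_i p_i log(p_i / q_i),
    with the convention 0 log(0/q) = 0, so terms with p_i = 0 are skipped."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

p = [0.5, 0.5, 0.0]
q = [0.25, 0.25, 0.5]
```

Note the asymmetry: D_{KL}(p‖q) is finite here, while D_{KL}(q‖p) would be infinite because q places mass where p does not.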
Errors in information entropy
The specification (60) of the CG approximation (with ρ(r ^{ n }) as opposed to ρ(R ^{ N }(θ))) requires some explanation. In interpreting (60), one assumes the role of an observer who resides in the AA system and, instead of the true phase function q(r ^{ n }), observes a corrupted version for each choice of θ, constrained to reside only in microstates accessible by the CG model. This is also the interpretation of the residual described in (13) and (15). It is also noted that the estimate (63) is reminiscent of the minimum relative entropy method suggested by Shell [13].
A fundamental question arises at this point: given estimates (36) or (63), is it possible to find a special parameter vector θ ^{∗} that makes the error ε(θ ^{∗})=0? This question is related to the so-called well-specification or misspecification of the CG model. We believe the answer to this question is generally “no.”
Model misspecification and statistical analysis
A fundamental concept in the mathematical statistics literature on parametric models is the notion of a well-specified model: one for which a special parameter vector θ ^{∗} exists that the model maps to the truth, i.e. the true observational data. If no such parameter exists, the model is said to be misspecified.
More generally, we consider a space \(\mathcal{Y}\) of physical observables (in our case, the values of appropriate observables sampled from the AA model) and a set \(\mathbb {M}(\mathcal {Y})\) of probability measures μ on \(\mathcal{Y}\). As always, a target quantity of interest \(Q:\mathbb {M} \rightarrow \mathbb {R}\) is identified (e.g. Q(μ)=μ[X≥a], X being a random variable and a a threshold value). We seek a particular measure μ ^{∗} which yields the “true” value of the quantity of interest Q(μ ^{∗}). We wish to predict Q(μ ^{∗}) using a parametric model \(\mathcal {P}: \Theta \rightarrow \mathbb {M}(\mathcal {Y})\), Θ being the space of parameters. Again, if a θ ^{∗}∈Θ exists such that \(\mathcal {P}(\boldsymbol {\theta }^{*})=\mu ^{*}\), the model is said to be well-specified; otherwise, if \(\mu ^{*} \notin \mathcal {P}(\Theta)\), the model is misspecified. See, e.g., Geyer [14], Kleijn and van der Vaart [7], Freedman [15], Nickl [16]. In our model discussed in Section ‘Preliminaries, conventions and notations’, we seek a parameter θ ^{∗} of the CG model such that ε(θ ^{∗}) of (36) is zero, an unlikely possibility for most choices of Q.
where \(Z = \int _{\Theta } \pi (\mathbf {y} \mid \boldsymbol {\theta }) \pi (\boldsymbol {\theta }) \,\text {d}\boldsymbol {\theta }\) is the model evidence.

The Maximum Likelihood Estimate (MLE) is the parameter \(\hat {\boldsymbol {\theta }}^{n}\) that maximizes L _{ n }(θ):$$ \hat{\boldsymbol{\theta}}^{n} = \underset{\boldsymbol{\theta} \in \Theta}{\text{argmax}} \; L_{n}(\boldsymbol{\theta}). $$(68)

The Maximum A Posteriori (MAP) Estimate is the parameter \(\tilde {\boldsymbol {\theta }}^{n}\) that maximizes the posterior pdf:$$ \tilde{\boldsymbol{\theta}}^{n} = \underset{\boldsymbol{\theta} \in \Theta}{\text{argmax}} \; \pi_{n}(\boldsymbol{\theta} \mid \mathbf{y}). $$(69)

The Bayesian Central Limit Theorem for well-specified models under commonly satisfied smoothness assumptions (also called the Bernstein-von Mises Theorem [7,16,17]) asserts that$$ \pi_{n}(\boldsymbol{\theta} \mid \mathbf{y}) \overset{\mathcal{P}}{\rightarrow} \mathcal{N}(\boldsymbol{\theta}^{*}; \mathbf{I}^{-1}(\boldsymbol{\theta}^{*})), $$(70)where convergence is convergence in probability, \(\mathcal {N}(\boldsymbol {\mu }, \boldsymbol {\Sigma })\) denotes a normal distribution with mean μ and covariance matrix Σ, \(\hat {\boldsymbol {\theta }}\) is the generalized MLE, and I(θ) is the Fisher information matrix,$$ I_{ij}(\boldsymbol{\theta}) = -\sum^{n}_{k=1} \left[ \frac{\partial^{2}}{\partial \theta_{i} \partial \theta_{j}} \log \pi({y}_{k} \mid \boldsymbol{\theta}) \right]_{\boldsymbol{\theta} = \boldsymbol{\theta}^{*}} $$(71)

Given a set of parametric models, \(\mathcal {M} = \{ \mathcal {P}_{1} (\boldsymbol {\theta }_{1}), \mathcal {P}_{2}(\boldsymbol {\theta }_{2}), \cdots, \mathcal {P}_{m} (\boldsymbol {\theta }_{m}) \}\), the posterior plausibility of model j is defined through the application of Bayesian arguments by (see [3,18])$$ \rho_{j} = \pi(\mathcal{P}_{j} \mid \mathbf{y}, \mathcal{M}) = \frac{\int_{\Theta_{j}} \pi (\mathbf{y} \mid \boldsymbol{\theta}_{j}, \mathcal{P}_{j}, \mathcal{M}) \, \pi (\boldsymbol{\theta}_{j} \mid \mathcal{P}_{j}, \mathcal{M}) \,\text{d} \boldsymbol{\theta}_{j} \; \pi (\mathcal{P}_{j} \mid \mathcal{M})} {\pi(\mathbf{y} \mid \mathcal{M})} $$(72)
with \(\sum _{j=1}^{m} \rho _{j} = 1\), and the largest ρ _{ j }∈[0,1] corresponds to the most plausible model for data \(\mathbf {y} \in \mathcal {Y}\).
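These plausibilities can be computed directly in simple cases. The sketch below is a hypothetical two-model comparison of our own construction (not the example from the paper): two Gaussian models for the mean of synthetic data, one with a correctly specified noise scale and one misspecified, with evidences evaluated by quadrature over a uniform prior on the mean. With prior odds of one, the better-specified model receives nearly all of the plausibility.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(0.0, 1.0, size=50)   # synthetic data drawn from the "true" distribution

def log_evidence(sigma, grid):
    """log Z = log of the integral of pi(y|theta) pi(theta) d(theta) for a Gaussian
    likelihood with known sigma and a uniform prior for the mean theta on
    [grid[0], grid[-1]], evaluated by simple quadrature on the grid."""
    ll = np.array([np.sum(-0.5 * ((y - t) / sigma) ** 2
                          - np.log(sigma * np.sqrt(2.0 * np.pi))) for t in grid])
    m = ll.max()                      # factor out the peak to avoid underflow
    dx = grid[1] - grid[0]
    width = grid[-1] - grid[0]
    return m + np.log(np.sum(np.exp(ll - m)) * dx / width)

grid = np.linspace(-2.0, 2.0, 401)
logZ1 = log_evidence(1.0, grid)       # model P1: correctly specified noise scale
logZ2 = log_evidence(3.0, grid)       # model P2: misspecified noise scale
rho1 = 1.0 / (1.0 + np.exp(logZ2 - logZ1))   # plausibility of P1 when O_12 = 1
```

The ratio of evidences is the Bayes factor; normalizing it with the prior model probabilities yields the plausibilities of (72).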
Finally, we come to the case of misspecified parametric models in which \(\mu ^{*} \notin \mathcal {P}(\Theta)\); i.e. no parameter θ ^{∗} exists such that the truth \(\mu ^{*} = \mathcal {P}(\boldsymbol {\theta }^{*})\). This situation, we believe, is by far the most common encountered in the use of CG models.
D _{ KL }(·∥·) being the Kullback-Leibler distance defined in (54). By Jensen’s inequality (see, e.g. [16]), Q(θ ^{∗})≤Q(θ) ∀ θ∈Θ; i.e. θ ^{∗} is the minimizer of Q.
\(\mathcal{M}\) being a class of parametric models to which \(\mathcal{P}\) belongs.
where the negative self-entropy \(\int g \log g \; d\textbf {y}\) was eliminated since it does not depend on θ and therefore does not affect the optimization.
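Since the self-entropy term does not depend on θ, minimizing the D_{KL} over θ is equivalent to maximizing the expected log-likelihood under g, which can be approximated by a sample average. The sketch below (synthetic data and a deliberately misspecified Gaussian family, our own construction) recovers the pseudo-true parameter θ^{†} in exactly this way.

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(1.5, 1.0, size=2000)   # samples from the "truth" g

# Misspecified model family: Gaussian with a fixed, wrong scale sigma = 2 and
# parameter theta = mean. Minimizing D_KL(g || pi(.|theta)) over theta is
# equivalent to maximizing E_g[log pi(y|theta)], approximated here by the
# sample average of the log-likelihood over a grid of candidate theta values.
grid = np.linspace(-1.0, 4.0, 501)
avg_ll = [np.mean(-0.5 * ((y - t) / 2.0) ** 2) for t in grid]
theta_dagger = grid[int(np.argmax(avg_ll))]
```

Even though no θ reproduces g exactly (the scale is wrong), the pseudo-true mean is recovered: the model is misspecified, yet θ^{†} is well defined.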
Plausibility D _{ KL } theory
it can be said that model \(\mathcal {P}_{1}\) is better than model \(\mathcal {P}_{2}\). The theorems presented here define the relationship between these two notions of model comparison.
where \(O_{12} = \pi (\mathcal {P}_{1} \mid \mathcal {M})/\pi (\mathcal {P}_{2} \mid \mathcal {M})\) is the prior odds, often assumed to be one. With these assumptions in force, we present the following theorems.
Theorem 2.
Let (82) hold. If \(\mathcal {P}_{1}\) is more plausible than \(\mathcal {P}_{2}\) and O _{12}≤1, then (80) holds.
Proof.
By adding the quantity \(\int _{\mathcal {Y}^{n}} g \log g \; d\textbf {y}\) to both sides, the desired result (80) immediately follows.
does not necessarily need to be true for every point \(\textbf {y} \in \mathcal {Y}^{n}\).
Thus \(\mathcal {P}_{1}\) is more plausible than \(\mathcal {P}_{2}\) for given data \(\bar {\textbf {y}}\).
In summary, we have:
Theorem 3.
If \(D_{\textit {KL}}(g \,\|\, \pi (\textbf {y} \mid \boldsymbol {\theta }^{\dagger }_{1}, \mathcal {P}_{1}, \mathcal {M})) < D_{\textit {KL}}(g \,\|\, \pi (\textbf {y} \mid \boldsymbol {\theta }^{\dagger }_{2}, \mathcal {P}_{2},\mathcal {M}))\) and if \(\left | \mathcal {Y}^{n} \right | < \infty \) and if (90) holds, then there exists a \(\bar {\textbf {y}} \in \mathcal {Y}^{n}\) such that \(\mathcal {P}_{1}\) is more plausible than \(\mathcal {P}_{2}\), given that O _{12}≥1.
Conclusions
The formal structure of a posteriori estimates of errors in quantities of interest in CG approximations of atomistic systems is given by (36) if the CG model is sufficiently close to the AA model in some sense, and this error depends upon the CG model parameters θ. Numerical experiments presented in Section ‘Numerical example: estimation of error in CG approximation of a polyethylene chain’ involving a family of CG models of a polyethylene chain of united atom monomers suggest that these estimates can be very good indications of the error inherent in CG models of observables in the AA system.
In Section ‘Errors in information entropy’ an example of special interest arises in the comparison of the information entropy of the AA and CG models. This leads to estimates (62) and (63) involving the Kullback-Leibler divergence, D _{ KL }.
When the CG model is misspecified in a statistical sense, which is generally the case, the “D _{ KL } distance” between the AA truth and the best possible approximation of any CG model is defined by choosing θ=θ ^{ † }, the minimizer of the D _{ KL } as indicated in (78). Under special assumptions, one can relate the D _{ KL } distance to Bayesian posterior model plausibility, as stated in our Theorem 2, which provides sufficient conditions for the most plausible model among a class of models to be, in fact, closest to the AA model in the D _{ KL } sense. The possible role of estimates such as (36), (62), and (63) in model validation should be noted.
The marginalization of the right-hand side of this relation is the model evidence, which serves as a likelihood function for a higher level of Bayes’s rule. The corresponding posteriors are the model plausibilities of (72). We remark that the notion of model plausibilities is an extension of the idea of Bayes factors prevalent in the statistics literature (see e.g. [12] for a discussion of the ideas) and was introduced, to the best of our knowledge, in [18]. The development of algorithms involving Bayesian plausibilities to study model selection in CG models of complex atomic systems is discussed in [3,20].
It has been demonstrated that the most plausible model in a set will, under the stated assumptions, involve parameters that minimize the D _{ KL } distance between the model and the so-called truth parameters. Whether that “best” model is valid for the intended purpose depends on tolerances set on errors in key observables, the QoIs of the validation scenario (see [3]).
Appendix
The theory and estimates reduce to finite element a posteriori error estimates in the special case in which u _{0}=u _{ h } and z _{0}=z _{ h } are finite element approximations of solutions (u,z) to partial differential equations (see e.g. [21]).
Declarations
Acknowledgements
This material is based upon work supported by the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under Award Number DE-SC0009286. The authors benefited from suggestions of Eric Wright, who read an early draft of this work.
Authors’ Affiliations
References
 Noid WG (2013) Perspective: coarse-grained models for biomolecular systems. J Chem Phys 139: 090901.
 Li Y, Abberton BC, Kröger M, Liu WK (2013) Challenges in multiscale modeling of polymer dynamics. Polymers 5(2): 751–832. doi:10.3390/polym5020751.
 Farrell K, Oden JT (2014) Calibration and validation of coarse-grained models of atomic systems: application to semiconductor manufacturing. Comput Mech 54(1): 3–19. doi:10.1007/s00466-014-1028-y.
 Oden JT, Prudhomme S (2002) Estimation of modeling error in computational mechanics. J Comput Phys 182(2): 496–515.
 Oden JT, Prudhomme S, Romkes A, Bauman PT (2006) Multiscale modeling of physical phenomena: adaptive control of models. SIAM J Sci Comput 28(6): 2359–2389.
 Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27: 379–423, 623–656.
 Kleijn BJK, van der Vaart A (2002) The asymptotics of misspecified Bayesian statistics. In: Mikosch T, Janzura M (eds) Proceedings of the 24th European Meeting of Statisticians, Prague, Czech Republic.
 Jorgensen WL, Tirado-Rives J (1988) The OPLS potential functions for proteins. Energy minimizations for crystals of cyclic peptides and crambin. J Am Chem Soc 110(6): 1657–1666.
 Jorgensen WL, Maxwell DS, Tirado-Rives J (1996) Development and testing of the OPLS all-atom force field on conformational energetics and properties of organic liquids. J Am Chem Soc 118(45): 11225–11236.
 Frenkel D, Smit B (2001) Understanding molecular simulation: from algorithms to applications. Computational Science, 2nd edn, Vol. 1. Academic Press, San Diego.
 Haile JM (1997) Molecular dynamics simulation. John Wiley and Sons, NY.
 Jaynes ET (2003) Probability theory: the logic of science. Cambridge University Press, Cambridge.
 Shell MS (2008) The relative entropy is fundamental to multiscale and inverse thermodynamic problems. J Chem Phys 129(14): 144108.
 Geyer CJ (2003) 5601 Notes: the sandwich estimator. School of Statistics, University of Minnesota.
 Freedman DA (2006) On the so-called “Huber sandwich estimator” and “robust standard errors”. Am Stat 60: 299–302.
 Nickl R (2012) Statistical theory. Statistical Laboratory, Department of Pure Mathematics and Mathematical Statistics, University of Cambridge.
 Kleijn BJK (2004) Bayesian asymptotics under misspecification. PhD thesis, Free University Amsterdam.
 Beck JL, Yuen KV (2004) Model selection using response measurements: Bayesian probabilistic approach. J Eng Mech 130(2): 192–203.
 Kleijn BJK, van der Vaart AW (2012) The Bernstein–von Mises theorem under misspecification. Electron J Stat 6: 354–381. doi:10.1214/12-EJS675.
 Farrell K, Oden JT, Faghihi D (2015) A Bayesian framework for adaptive selection, calibration, and validation of coarse-grained models of atomistic systems. J Comput Phys 295: 189–208.
 Becker R, Rannacher R (2001) An optimal control approach to a posteriori error estimation in finite element methods. Acta Numerica 10: 1–102. doi:10.1017/S0962492901000010.
Copyright
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.