Abstract
In this paper, we propose an R package, called RKHSMetaMod, that implements a procedure for estimating a meta-model of a complex model. The meta-model approximates the Hoeffding decomposition of the complex model and allows us to perform sensitivity analysis on it. It belongs to a reproducing kernel Hilbert space that is constructed as a direct sum of Hilbert spaces. The estimator of the meta-model is the solution of a penalized empirical least-squares minimization, where the penalty is the sum of the Hilbert norm and the empirical \(L^2\)-norm. This procedure, called RKHS ridge group sparse, makes it possible both to select and to estimate the terms in the Hoeffding decomposition, and therefore to select and estimate the Sobol indices that are non-zero. The RKHSMetaMod package provides an interface from the R statistical computing environment to the C++ libraries Eigen and GSL. In order to speed up the execution time and optimize the storage memory, except for one function that is written in R, all of the functions of this package are written in C++ using these efficient libraries through the RcppEigen and RcppGSL packages. These functions are then interfaced in the R environment in order to propose a user-friendly package.

Consider a phenomenon described by a model \(m\) depending on \(d\) input variables \(X=(X_1,...,X_d)\). This model \(m\) from \(\mathbb{R}^d\) to \(\mathbb{R}\) may be a known model that is calculable at all points of \(X\), i.e. \(Y=m(X)\), or it may be an unknown regression model defined as follows: \[\begin{aligned} \label{grm} Y=m(X)+\varepsilon, \end{aligned} (\#eq:grm)\] where the error \(\varepsilon\) is assumed to be centered with a finite variance, i.e. \(E(\varepsilon)=0\) and \(var(\varepsilon)<\infty\). The components of \(X\) are independent with a known law \(P_X=\prod_{a=1}^dP_{X_a}\) on \(\mathcal{X}\), a subset of \(\mathbb{R}^d\). The number \(d\) of components of \(X\) may be large. The model \(m\) may present high complexity, such as strong non-linearities and high order interaction effects, and it is assumed to be square-integrable, i.e. \(m\in L^2(\mathcal{X},P_X)\). Based on the data points \(\{(X_i,Y_i)\}_{i=1}^n\), we estimate a meta-model that approximates the Hoeffding decomposition of \(m\). This meta-model belongs to a reproducing kernel Hilbert space (RKHS), which is constructed as a direct sum of Hilbert spaces, leading to an additive decomposition that includes the input variables and the interactions between them (Durrande et al. 2013). The estimator of the meta-model is calculated by minimizing an empirical least-squares criterion penalized by the sum of two penalty terms: the Hilbert norm and the empirical norm (Huet and Taupin 2017). This procedure allows us to select the subsets of variables \(X_1,...,X_d\) that contribute to predicting \(Y\). The estimated meta-model is used to perform sensitivity analysis, and therefore to determine the influence of each variable and of groups of them on the output variable \(Y\).
In the classical framework of sensitivity analysis, the function \(m\) is calculable at all points of \(X\), and one may use the method of Sobol (1993) to perform the sensitivity analysis on \(m\). Let us briefly introduce this method:
The independence of the components of \(X\) makes it possible to write the function \(m\) according to its Hoeffding
decomposition (Sobol 1993; Van der Vaart
1998): \[\begin{aligned}
\label{sobol1}
m(X)=m_0+\sum_{a=1}^dm_a(X_a)+\sum_{a<a'}m_{a,a'}(X_a,X_{a'})+...+m_{1,...,d}(X).
\end{aligned} (\#eq:sobol1)\] The terms in this decomposition
are defined using the conditional expected values: \[\begin{aligned}
m_0&=E_X(m(X)),\;
m_a(X_a)=E_X(m(X)|X_a)-m_0; \\
m_{a,a'}(X_a,X_{a'})&=E_X(m(X)|X_a,X_{a'})-m_a(X_a)-m_{a'}(X_{a'})-m_0,
\cdots
\end{aligned}\] These terms are known as the constant term, main
effects, interactions of order two and so on. Let \(\mathcal{P}\) be the set of all non-empty subsets of
\(\{1,...,d\}\), i.e. subsets of cardinality \(1\) to \(d\). For all \(v\in\mathcal{P}\) and \(X\in\mathcal{X}\), let \(X_v\) be the vector with components \(X_a\), \(a\in
v\). For a set \(A\) let \(\vert A\vert\) be its cardinality, and for
all \(v\in\mathcal{P}\), let \(m_v:\mathbb{R}^{\vert v\vert}\rightarrow
\mathbb{R}\) be the function associated with \(X_v\) in Equation (@ref(eq:sobol1)). Then
Equation (@ref(eq:sobol1)) can be expressed as follows: \[\begin{aligned}
\label{ch3sobol}
m(X)=m_0+\sum_{v\in\mathcal{P}}m_v(X_v).
\end{aligned} (\#eq:ch3sobol)\] This decomposition is unique,
all terms \(m_v\), \(v\in\mathcal{P}\) are centered, and they
are orthogonal with respect to \(L^2(\mathcal{X},P_X)\). The functions \(m\) and \(m_v\), \(v\in\mathcal{P}\) in Equation
(@ref(eq:ch3sobol)) are square-integrable. As any two terms of
decomposition (@ref(eq:ch3sobol)) are orthogonal, by squaring
(@ref(eq:ch3sobol)) and integrating it with respect to the distribution
of \(X\), a decomposition of the
variance of \(m(X)\) is obtained as
follows: \[\begin{aligned}
\label{chpacvardecop}
\text{var}(m(X))=\sum_{v\in\mathcal{P}}\text{var}(m_v(X_v)).
\end{aligned} (\#eq:chpacvardecop)\] The Sobol indices
associated with the group of variables \(X_v\), \(v \in
\mathcal{P}\) are defined by: \[\begin{aligned}
\label{trusob}
S_v=\text{var}(m_v(X_v))/\text{var}(m(X)).
\end{aligned} (\#eq:trusob)\] For each \(v\), \(S_v\) expresses the fraction of the
variance of \(m(X)\) explained by \(X_v\). For all \(v\in\mathcal{P}\), when \(\vert v\vert=1\), the \(S_v\)'s are referred to as the first order
indices; when \(\vert v\vert =2\),
i.e. \(v=\{a,a'\}\) and \(a\neq a'\), they are referred to as the
second order indices or the interaction indices of order two (between
\(X_a\) and \(X_{a'}\)); and the same holds for \(\vert v\vert>2\).
The total number of the Sobol indices to be calculated is equal to
\(\vert\mathcal{P}\vert=2^d-1\), which
rises exponentially with the number of input variables \(d\). When \(d\) is large, the evaluation of all the indices can be computationally demanding or even out of reach. In practice, only the indices of order not higher than two are calculated. However, the first and second order indices alone may not fully characterize the model sensitivities. To provide richer information at a reasonable cost, Homma and Saltelli (1996) proposed to calculate the first order and the
total indices defined as follows:
Let \(\mathcal{P}_a\subset\mathcal{P}\)
be the set of all subsets of \(\{1,...,d\}\) containing \(a\); then \(S_{T_a}=\sum_{v\in\mathcal{P}_a}S_v\). For all \(a\in\{1,...,d\}\), \(S_{T_a}\) denotes the total effect of \(X_a\): it expresses the fraction of the variance of \(m(X)\) explained by \(X_a\) alone and by all of its interactions with the other variables. The total indices allow us to rank the input
variables with respect to the amount of their effect on the output
variable. However, they do not provide complete information on the model
sensitivities as do all the Sobol indices.
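For illustration, the first order and total indices can be estimated with the extended-FAST method of the sensitivity package; the following is a minimal sketch (not part of RKHSMetaMod), where `gfun` is a hypothetical cheap-to-evaluate model on \([0,1]^3\):

``` r
# Sketch: first order and total Sobol indices via sensitivity::fast99.
# 'gfun' is a hypothetical vectorized model (a g-function with all
# coefficients equal to 0.5), used only for illustration.
library(sensitivity)
gfun <- function(X) apply(X, 1, function(x) prod((abs(4*x - 2) + 0.5)/1.5))
sa <- fast99(model = NULL, factors = 3, n = 1000,
             q = "qunif", q.arg = list(min = 0, max = 1))
y <- gfun(sa$X)   # evaluate the model on the FAST design
tell(sa, y)
print(sa)         # first order and total indices for X1, X2, X3
```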
The classical computation of the Sobol indices is based on Monte Carlo methods (see, for example, Sobol (1993) for the main effect and interaction indices, and Andrea Saltelli (2002) for the main effect and total indices). For models that are expensive to evaluate, the Monte Carlo methods lead to a high computational burden. Moreover, in the case where \(d\) is large, \(m\) is complex, and the calculation of the variances (see Equation (@ref(eq:chpacvardecop))) is numerically complicated or not possible (as in the case where the model \(m\) is unknown), the methods described above are not applicable. Another approach consists in approximating \(m\) by a simplified model, called a meta-model, which is much faster to evaluate and on which the sensitivity analysis can be performed. Besides approximating the Sobol indices of \(m\) at a lower computational cost, a meta-model provides a deeper view of the effects of the input variables on the model output. Among the meta-modelling methods proposed in the literature, the expansion based on the polynomial Chaos (Wiener 1938; Schoutens 2000) can be used to approximate the Hoeffding decomposition of \(m\) (Sudret 2008). The principle of the polynomial Chaos is to project \(m\) onto a basis of orthonormal polynomials. The polynomial Chaos expansion of \(m\) is written as (Soize and Ghanem 2004): \[\begin{aligned} \label{ch3pc} m(X)=\sum_{j=0}^\infty h_j\phi_j(X), \end{aligned} (\#eq:ch3pc)\] where \(\{h_j\}_{j=0}^\infty\) are the coefficients, and \(\{\phi_j\}_{j=0}^\infty\) are the multivariate orthonormal polynomials associated with \(X\), which are determined according to the distribution of the components of \(X\). In practice, expansion (@ref(eq:ch3pc)) must be truncated for computational purposes, and the model \(m\) may be approximated by \(\sum_{j=0}^{v_{max}} h_j\phi_j(X)\), where \(v_{max}\) is determined using a truncation scheme. The Sobol indices are then obtained by summing up the squares of the suitable coefficients. Blatman and Sudret (2011) proposed a method for truncating the polynomial Chaos expansion and an algorithm based on the least angle regression for selecting the terms in the expansion. In this approach, a unique family of orthonormal polynomials \(\{\phi_j\}_{j=0}^\infty\) is determined according to the distribution of the components of \(X\). However, this family may not necessarily be the best functional basis for approximating \(m\) well.
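To make the last point concrete, here is a toy sketch (with hypothetical coefficients) of how Sobol indices are read off a truncated polynomial Chaos expansion: the index of a group \(v\) is the sum of the squared coefficients of the polynomials involving exactly the inputs in \(v\), divided by the total variance \(\sum_{j\geq1}h_j^2\).

``` r
# Toy sketch: Sobol indices from a truncated polynomial Chaos expansion.
# h and vars are hypothetical: h[j] is the coefficient of phi_j and
# vars[[j]] the set of inputs that phi_j involves (j >= 1).
h    <- c(1.2, 0.8, -0.5, 0.3, 0.1)
vars <- list(1, 2, c(1, 2), 1, 2)
D    <- sum(h^2)                          # total variance of the expansion
S_v  <- function(v) sum(h[sapply(vars, identical, v)]^2) / D
S_v(1); S_v(2); S_v(c(1, 2))              # indices of {1}, {2} and {1,2}
```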
Gaussian processes (GP) can also be used to construct meta-models, as highlighted in Welch et al. (1992), Oakley and O’Hagan (2004), Kleijnen (2007, 2009), Marrel et al. (2009), Durrande, Ginsbourger, and Roustant (2012), and Loic Le Gratiet, Cannamela, and Iooss (2014). The principle is to consider that the prior knowledge about the function \(m(X)\) can be modelled by a GP \(\mathcal{Z}(X)\) with a mean \(m_{\mathcal{Z}}(X)\) and a covariance kernel \(k_{\mathcal{Z}}(X,X')\). To perform sensitivity analysis from a GP model, one may replace the model \(m(X)\) with the mean of the conditional GP and deduce the Sobol indices from it. A review of the meta-modelling based on the polynomial Chaos and the GP is presented in L. Le Gratiet, Marelli, and Sudret (2017).
Durrande et al. (2013) considered a class of functional approximation methods similar to the GP approach and obtained a meta-model that satisfies the properties of the Hoeffding decomposition. They proposed to approximate \(m\) by functions belonging to a RKHS \(\mathcal{H}\) which is a direct sum of Hilbert spaces. Their RKHS \(\mathcal{H}\) is constructed in such a way that the projection of \(m\) onto \(\mathcal{H}\), denoted \(f^*\), is an approximation of the Hoeffding decomposition of \(m\). The function \(f^*\) is defined as the minimizer, over the functions \(f\in\mathcal{H}\), of the criterion \(E_X(m(X)-f(X))^2.\)
Let \(\langle\cdot,\cdot\rangle_{\mathcal{H}}\) be the scalar product in \(\mathcal{H}\), and let \(k\) and \(k_v\) be the reproducing kernels associated with the RKHS \(\mathcal{H}\) and the RKHS \(\mathcal{H}_v\), respectively. The properties of the RKHS \(\mathcal{H}\) ensure that any function \(f\in\mathcal{H}\), \(f:\mathcal{X}\subset \mathbb{R}^d\rightarrow\mathbb{R}\), can be written as the following decomposition: \[\begin{aligned} \label{durandhoeff} f(X)=\langle f,k(X,\cdot)\rangle_{\mathcal{H}}=f_0+\sum_{v\in\mathcal{P}}f_v(X_v), \end{aligned} (\#eq:durandhoeff)\] where \(f_0\) is a constant, and \(f_v:\mathbb{R}^{\vert v\vert}\rightarrow \mathbb{R}\) is defined by \(f_v(X)=\langle f,k_v(X,\cdot)\rangle_{\mathcal{H}}\). For more details on the RKHS construction and the definition of the Hilbert norm, see Section "RKHS construction" in the Appendix (supplementary materials).
For all \(v\in\mathcal{P}\), the functions \(f_v(X_v)\) are centered and, for all \(v\neq v'\), the functions \(f_v(X_v)\) and \(f_{v'}(X_{v'})\) are orthogonal with respect to \(L^2(\mathcal{X},P_X)\). Therefore, the decomposition of the function \(f\) presented in Equation (@ref(eq:durandhoeff)) is its Hoeffding decomposition. As the function \(f^*\) belongs to the RKHS \(\mathcal{H}\), it is decomposed as its Hoeffding decomposition, \(f^*=f^*_0+\sum_{v\in\mathcal{P}}f^*_v\), and each function \(f^*_v\) approximates the function \(m_v\) in Equation (@ref(eq:ch3sobol)). The number of terms \(f^*_v\) that should be estimated in the Hoeffding decomposition of \(f^*\) is equal to \(\vert\mathcal{P}\vert=2^d-1\), which may be huge since it rises very quickly as \(d\) increases. In order to deal with this problem, in the regression framework, one may estimate \(f^*\) by a sparse meta-model \(\widehat{f}\in\mathcal{H}\). To this end, the estimation of \(f^*\) is done on the basis of \(n\) observations by minimizing a least-squares criterion suitably penalized in order to deal with both the non-parametric nature of the problem and the possibly large number of functions that have to be estimated. In the classical framework of sensitivity analysis, one may calculate a sparse approximation of \(f^*\) using a penalized least-squares criterion, as is done in the non-parametric regression framework. In order to obtain a sparse solution of a minimization problem, the penalty function should enforce sparsity. There exist various ways of enforcing sparsity for a minimization (maximization) problem; see, for example, Hastie, Tibshirani, and Wainwright (2015) for a review. Some methods, such as the Sparse Additive Models (SpAM) procedure (Ravikumar et al. 2009; Liu, Wasserman, and Lafferty 2009), are based on a combination of the \(l_1\)-norm with the empirical \(L^2\)-norm: \(\Vert f\Vert_{n,1}=\sum_{a=1}^d\Vert f_a\Vert_n,\) where \(\Vert f_a\Vert_n^2=\sum_{i=1}^nf_a^2(X_{ai})/n\) is the squared empirical \(L^2\)-norm of the univariate function \(f_a\). The Component Selection and Smoothing Operator (COSSO) method developed by Lin and Zhang (2006) enforces sparsity using a combination of the \(l_1\)-norm with the Hilbert norm: \(\Vert f\Vert_{\mathcal{H},1}=\sum_{a=1}^d\Vert f_a\Vert_{\mathcal{H}_a}\). Instead of focusing on only one penalty term, one may consider a more general family of estimators, called doubly penalized estimators, obtained by minimizing a criterion penalized by the sum of two penalty terms. Raskutti, Wainwright, and Yu (2009, 2012) proposed a doubly penalized estimator, which is the solution of the minimization of a least-squares criterion penalized by the sum of a sparsity penalty term and a combination of the \(l_1\)-norm with the Hilbert norm: \[\begin{aligned} \label{doubpen} \gamma\Vert f\Vert_{n,1}+\mu\Vert f\Vert_{\mathcal{H},1}, \end{aligned} (\#eq:doubpen)\] where \(\gamma, \mu\in\mathbb{R}\) are tuning parameters that should be suitably chosen.
Meier, Geer, and Bühlmann (2009) proposed a related family of estimators, based on the penalization with the empirical \(L^2\)-norm. Their penalty function is the sum of a sparsity penalty term, \(\Vert f\Vert_{n,1}\), and a smoothness penalty term. Huet and Taupin (2017) considered the same approximation functional spaces as Durrande et al. (2013), and obtained a doubly penalized estimator of a meta-model which approximates the Hoeffding decomposition of \(m\). Their estimator is the solution of the least-squares minimization penalized by the penalty function defined in Equation (@ref(eq:doubpen)) adapted to the multivariate setting, \[\begin{aligned} \label{ch3pen} \gamma\Vert f\Vert_{n}+\mu\Vert f\Vert_{\mathcal{H}},\quad\text{with}\quad \Vert f\Vert_{n}=\sum_{v\in\mathcal{P}}\Vert f_v\Vert_{n},\:\Vert f\Vert_{\mathcal{H}}=\sum_{v\in\mathcal{P}}\Vert f_v\Vert_{\mathcal{H}_v}. \end{aligned} (\#eq:ch3pen)\] This procedure, called RKHS ridge group sparse, estimates the groups \(v\) that are suitable for predicting \(f^*\), and the relationship between \(f^*_v\) and \(X_v\) for each group. The obtained estimator, called the RKHS meta-model, is then used to estimate the Sobol indices of \(m\). This approach makes it possible to estimate the Sobol indices for all groups in the support of the RKHS meta-model, including interactions of possibly high order, a point known to be difficult in practice.
In this paper, we introduce an R package, called RKHSMetaMod, that implements the RKHS ridge group sparse procedure. The functions of this package allow us to estimate a sparse RKHS meta-model by solving the associated convex optimization problems, to compute the prediction error of the obtained estimators, and to estimate the empirical Sobol indices of the groups in their support.
The current version of the package supports uniformly distributed input variables on \(\mathcal{X}=[0,1]^d\). However, it could be easily adapted to datasets with input variables from another distribution by making a small modification to one of its functions (see Remark 3 of Section Calculation of the Gram matrices).
Let us give a brief overview of the existing statistical
packages related to the RKHSMetaMod package. The R package sensitivity
is designed to implement sensitivity analysis methods and provides the
approaches for the numerical calculation of the Sobol indices. In particular, the Kriging method can be used to reduce the number of observations in global sensitivity analysis. The function
sobolGP of the package sensitivity builds a
Kriging based meta-model using the function km of the
package DiceKriging
(Roustant, Ginsbourger, and Deville 2012),
and estimates its Sobol indices. This procedure can also be done using
the function km and the function fast99 of the
package sensitivity (see Section 4.5. of Roustant, Ginsbourger, and Deville (2012)). In
this case, the idea is once again to build a Kriging based meta-model
using the function km and then estimate its Sobol indices
using the function fast99. In both cases the true function
is substituted by a Kriging based meta-model and then its Sobol indices
are estimated. In the sobolGP function, the Sobol indices
are estimated by the Monte Carlo integration, while the
fast99 function estimates them using the extended-FAST
method (A. Saltelli, Tarantola, and Chan
1999). To reduce the computational burden when dealing with large
datasets and complex models, in the RKHSMetaMod package, we propose
to use the empirical variances to estimate the Sobol indices (see
Section Estimation of the Sobol indices).
Besides, the estimation of the Sobol indices in the RKHSMetaMod package is based on the RKHS meta-model, which is a sparse estimator. This is beneficial since, instead of calculating the Sobol indices of all groups \(v\in\mathcal{P}\), only the Sobol indices
associated with the groups in the support of the RKHS meta-model are
computed (see Section Estimation of the Sobol
indices). Moreover, the functions sobolGP and
fast99 provide the estimation of the first order and the
total Sobol indices only, while the procedure in the
RKHSMetaMod package makes it possible to estimate the high
order Sobol indices. The R packages DiceKriging and DiceOptim
(Deep Inside Computer Experiments Kriging/Optim) (Roustant, Ginsbourger, and Deville 2012)
implement the Kriging based meta-models to estimate complex models in
the high dimensional context. They provide different GP (Kriging) models
corresponding to the Gaussian, Matérn, Exponential and Power-Exponential
correlation functions. The estimation of the parameters of the correlation functions in these packages relies on genoud, the global optimization algorithm with gradients of the package rgenoud (Mebane and Sekhon 2011). These packages
do not implement any method of the sensitivity analysis themselves.
However, some authors (see Section 4.5. of Roustant, Ginsbourger, and Deville (2012) for
example) perform sensitivity analysis on their estimated meta-models by
employing the functions of the package sensitivity. The R
package RobustGaSP
(Robust Gaussian Stochastic Process) (Gu, Palomo,
and Berger 2019) approximates a complex model by a GP meta-model.
This package implements marginal posterior mode estimation of the GP
model parameters. The estimation method in this package ensures the robustness of the parameter estimation in the GP model, and also allows one to identify input variables that have no effect on the variability
of the function under study. The R package mlegp
(maximum likelihood estimates of Gaussian processes) (Dancik and Dorman 2008) provides functions to
implement both meta-modelling approaches and sensitivity analysis
methods. It obtains maximum likelihood estimates of the GP model for the
output of costly computer experiments. The GP models are built either on
the basis of the Gaussian correlation function or on the basis of the
first degree polynomial trend. The sensitivity analysis methods
implemented in this package include Functional Analysis of Variance
(FANOVA) decomposition, plot functions to obtain diagnostic plots, main
effects, and second order interactions. The prediction quality of the
meta-model depends on the quality of the estimation of its parameters
and more precisely the estimation of parameters in the correlation
functions (Kennedy and O’Hagan 2000). The
maximum likelihood estimation of these parameters often produces unstable results, and as a consequence, the obtained meta-model may have an
inferior prediction quality (Gu, Wang, and Berger
2018; Gu 2019). The RKHSMetaMod package is devoted to
the meta-model estimation on the RKHS \(\mathcal{H}\). It implements convex optimization algorithms to calculate meta-models, provides functions to compute the prediction error of the obtained meta-models, and performs sensitivity analysis on them, more precisely, it calculates their Sobol indices. The convex optimization algorithms used
in this package are all written using C++ libraries, and are adapted to
take into account the problem of high dimensionality in this context.
This package is available from the Comprehensive R Archive Network
(CRAN) (Kamari 2019).
The organization of the paper is as follows: In the next Section, we describe the estimation method. In Section Algorithms, we present in detail the algorithms used in the RKHSMetaMod package. Section RKHSMetaMod through examples includes two parts: In the first part, Section Simulation study, the performance of the RKHSMetaMod package functions is validated through a simulation study. In the second part, Section Comparison examples, the comparison in terms of predictive accuracy between the RKHS meta-model and the Kriging based meta-models from the RobustGaSP (Gu, Palomo, and Berger 2019) and DiceKriging (Roustant, Ginsbourger, and Deville 2012) packages is given through two examples.
In this Section, we present: the RKHS ridge group sparse and the RKHS group lasso procedures (see RKHS ridge group sparse and RKHS group lasso procedures), the strategy of choosing the tuning parameters in the RKHS ridge group sparse algorithm (see Choice of the tuning parameters), and the calculation of the empirical Sobol indices of the RKHS meta-model (see Estimation of the Sobol indices).
Let us denote by \(n\) the number of observations. The dataset consists of a vector of \(n\) observations \(Y=(Y_1,...,Y_n)\), and a \(n\times d\) matrix of features \(X\) with components \((X_{ai},i=1,...,n,a=1,...,d)\in\mathbb{R}^{n\times d}.\) For some tuning parameters \(\gamma_v\), \(\mu_v\), \(v\in\mathcal{P}\), the RKHS ridge group sparse criterion is defined by, \[\begin{aligned} \label{functional} \mathcal{L}(f)=\frac{1}{n}\sum_{i=1}^n\Big(Y_i-f_0-\sum_{v\in\mathcal{P}}f_v(X_{vi}) \Big)^2 +\sum_{v\in\mathcal{P}}\gamma_v\Vert f_v\Vert_n+\sum_{v\in\mathcal{P}}\mu_v\Vert f_v\Vert_{\mathcal{H}_v} , \end{aligned} (\#eq:functional)\] where \(X_v\) represents the matrix of variables corresponding to the \(v\)-th group, i.e. \(X_v=(X_{vi},i=1,...,n,v\in\mathcal{P})\in\mathbb{R}^{n\times \vert \mathcal{P}\vert},\) and where \(\Vert f_v\Vert_n\) is the empirical \(L^2\)-norm of \(f_v\) defined by the sample \(\{X_{vi}\}_{i=1}^n\) as \(\Vert f_v\Vert_n=\sqrt{\sum_{i=1}^nf_v^2(X_{vi})/n}.\)
The penalty function in the criterion (@ref(eq:functional)) is the sum of the Hilbert norm and the empirical norm, which allows us to select a few terms in the additive decomposition of \(f\) over the sets \(v \in \mathcal{P}\). Moreover, the Hilbert norm favours the smoothness of the estimated \(f_v\), \(v \in \mathcal{P}\).
Let \(\mathcal{F}= \{ f:\: f=f_{0} +
\sum_{v\in \mathcal{P}} f_{v},\: f_v \in \mathcal{H}_{v},\:
\|f_v\|_{\mathcal{H}_v} \leq r_v,\:r_v\in\mathbb{R}^{+} \}\) be
the set of candidate functions. Then the RKHS meta-model is defined by, \[\label{ch3prediction}
\widehat{f}=\mathop{\mathrm{arg\,min}}_{f\in\mathcal{F}}\mathcal{L}(f). (\#eq:ch3prediction)\]
According to the Representer Theorem (Kimeldorf
and Wahba 1970), the non-parametric functional minimization
problem described above is equivalent to a parametric minimization
problem. Indeed, the solution of the minimization problem
(@ref(eq:ch3prediction)) belonging to the RKHS \(\mathcal{H}\) is written as \(f=f_0+\sum_{v\in\mathcal{P}}f_v\), where
for some matrix \(\theta=(\theta_{vi},i=1,...,n,v\in\mathcal{P})\in\mathbb{R}^{n\times\vert\mathcal{P}\vert}\)
we have for all \(v\in\mathcal{P}\),
\[\begin{aligned}
\label{normhvdef}
f_v(\cdot)=\sum_{i=1}^n\theta_{vi}k_v(X_{vi},\cdot),\quad\text{and}\quad \Vert f_v\Vert^2_{\mathcal{H}_v}=\sum_{i,i'=1}^n\theta_{vi}\theta_{vi'}k_v(X_{vi},X_{vi'}).
\end{aligned} (\#eq:normhvdef)\] Let \(\Vert\cdot\Vert\) be the Euclidean norm (called
also \(L^2\)-norm) in \(\mathbb{R}^n\), and for each \(v\in\mathcal{P}\), let \(K_v\) be the \(n\times n\) Gram matrix associated with the
kernel \(k_v(.,.)\), i.e. \((K_v)_{i,i'}=k_v(X_{vi},X_{vi'})\).
Let also \(K_v^{1/2}\) be the matrix that satisfies \((K_v^{1/2})^{\top}K_v^{1/2}=K_v\), and
let \(\widehat{f}_0\) and \(\widehat{\theta}\) be the minimizers of the
following penalized least-squares criterion: \[\begin{aligned}
C(f_0,\theta)=\Vert Y-f_0
I_n-\sum_{v\in\mathcal{P}}K_v\theta_v\Vert^2+\sqrt{n}\sum_{v\in\mathcal{P}}\gamma_v\Vert
K_v\theta_v\Vert+n\sum_{v\in\mathcal{P}}\mu_v\Vert
K_v^{1/2}\theta_v\Vert.
\end{aligned}\] Then the estimator \(\widehat{f}\) defined in Equation
(@ref(eq:ch3prediction)) satisfies, \[\begin{aligned}
\widehat{f}(X)=\widehat{f}_0+\sum_{v\in\mathcal{P}}\widehat{f}_v(X_v)\quad\text{with}\quad\widehat{f}_v(X_v)=\sum_{i=1}^n\widehat{\theta}_{vi}k_v(X_{vi},X_v).
\end{aligned}\]
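As an illustration of the representer form above (not the package's internal code), \(\widehat{f}\) can be evaluated at a new point by summing the kernel evaluations against the estimated coefficients; a minimal sketch with illustrative names:

``` r
# Sketch: evaluating the RKHS meta-model at a new point x_new.
# X is the n x d design matrix, groups a list of index sets v,
# theta_hat a list of estimated coefficient vectors, and kv a product
# kernel function of two vectors (all illustrative placeholders).
predict_rkhs <- function(x_new, f0_hat, X, groups, theta_hat, kv) {
  fx <- f0_hat
  for (j in seq_along(groups)) {
    v <- groups[[j]]
    # \hat f_v(x) = sum_i theta_vi k_v(X_vi, x_v)
    kvec <- apply(X[, v, drop = FALSE], 1, function(xi) kv(xi, x_new[v]))
    fx <- fx + sum(theta_hat[[j]] * kvec)
  }
  fx
}
```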
Remark 1. The constraint \(\Vert f_v\Vert_{\mathcal{H}_v}\leq r_v\) is crucial for theoretical properties, but the value of \(r_v\) is generally unknown and has no practical usefulness. In this package, it is not taken into account in the parametric minimization problem.
For each \(v\in\mathcal{P}\), let \(\gamma'_v\) and \(\mu'_v\) be the weights that are chosen suitably. We define \(\gamma_v=\gamma\times\gamma'_v\) and \(\mu_v=\mu\times\mu'_v\) with \(\gamma,\mu\in\mathbb{R}^+\).
Remark 2. This formulation simplifies the choice of the tuning parameters since instead of tuning \(2\times\vert\mathcal{P}\vert\) parameters \(\gamma_v\) and \(\mu_v\), \(v\in\mathcal{P}\), only two parameters \(\gamma\) and \(\mu\) are tuned. Moreover, the weights \(\gamma'_v\) and \(\mu'_v\), \(v\in\mathcal{P}\) may be of interest in practice. For example, one can take weights that increase with the cardinal of \(v\) in order to favour the effects with small interaction order between variables.
For the sake of simplicity, in the rest of this paper, for all \(v\in\mathcal{P}\) the weights \(\gamma'_v\) and \(\mu'_v\) are set to one, and the RKHS ridge group sparse criterion is then expressed as follows: \[\begin{aligned} \label{parametric} C(f_0,\theta)=\Vert Y-f_0 I_n-\sum_{v\in\mathcal{P}}K_v\theta_v\Vert^2+\sqrt{n}\gamma\sum_{v\in\mathcal{P}}\Vert K_v\theta_v\Vert+n\mu\sum_{v\in\mathcal{P}}\Vert K_v^{1/2}\theta_v\Vert. \end{aligned} (\#eq:parametric)\] If we consider only the second part of the penalty function in the criterion (@ref(eq:parametric)) (i.e. set \(\gamma=0\)), we obtain the RKHS group lasso criterion: \[\begin{aligned} \label{Lasso} C_g(f_0,\theta)=\Vert Y-f_0I_n-\sum_{v\in\mathcal{P}}K_v\theta_v\Vert^2+n\mu\sum_{v\in\mathcal{P}}\Vert K_v^{1/2}\theta_v\Vert, \end{aligned} (\#eq:Lasso)\] which is a group lasso criterion (Yuan and Lin 2006) up to a scale transformation.
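As a concrete reading of the criterion (@ref(eq:parametric)), the following sketch (illustrative, not the package's optimizer) evaluates \(C(f_0,\theta)\) for given Gram matrices and coefficients:

``` r
# Sketch: evaluating the RKHS ridge group sparse criterion of Equation
# (eq:parametric). Kv and theta are lists over the groups v; all names
# are illustrative placeholders.
ridge_group_sparse_crit <- function(Y, Kv, theta, f0, gamma, mu) {
  n <- length(Y)
  fitted <- rep(f0, n); pen_n <- 0; pen_H <- 0
  for (v in seq_along(Kv)) {
    Kt <- drop(Kv[[v]] %*% theta[[v]])
    fitted <- fitted + Kt
    pen_n <- pen_n + sqrt(sum(Kt^2))               # ||K_v theta_v||
    e <- eigen(Kv[[v]], symmetric = TRUE)          # K_v^{1/2} via spectrum
    half <- e$vectors %*% (sqrt(pmax(e$values, 0)) * t(e$vectors))
    pen_H <- pen_H + sqrt(sum((half %*% theta[[v]])^2))
  }
  sum((Y - fitted)^2) + sqrt(n)*gamma*pen_n + n*mu*pen_H
}
```

Setting gamma = 0 gives the RKHS group lasso criterion (@ref(eq:Lasso)).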
In the RKHSMetaMod package, the RKHS ridge group sparse algorithm is initialized using the solutions obtained by solving the RKHS group lasso algorithm. Indeed, the penalty function in the RKHS group lasso criterion (@ref(eq:Lasso)) ensures the sparsity of the solution. Therefore, for a given value of \(\mu\), by implementing the RKHS group lasso algorithm (see Section RKHS group lasso), a RKHS meta-model with few terms in its additive decomposition is obtained. The support and the coefficients of a RKHS meta-model obtained by implementing the RKHS group lasso algorithm will be denoted by \(\widehat{S}_{\widehat{f}_{\text{Group Lasso}}}\) and \(\widehat{\theta}_{\text{Group Lasso}}\), respectively. From now on, we denote the tuning parameter in the RKHS group lasso criterion by: \[\label{tungrp} \mu_g=\sqrt{n}\mu. (\#eq:tungrp)\]
While dealing with an optimization problem for a criterion of the form (@ref(eq:parametric)), one of the essential steps is to choose the appropriate tuning parameters. Cross-validation is generally used for that purpose. Nevertheless, in the context of high-dimensional complex models, the computational time for a cross-validation procedure may be prohibitively high. Therefore, we propose a procedure based on a single testing dataset, carried out in two steps: first, for \(\gamma=0\) and a grid of values of \(\mu\), a sequence of RKHS meta-models is calculated with the RKHS group lasso algorithm, and the value of \(\mu\) that minimizes the prediction error on the testing dataset is retained; then, for a smaller grid of values of \(\mu\) around the retained value and a grid of values of \(\gamma>0\), the RKHS ridge group sparse algorithm is run, and the pair \((\mu,\gamma)\) with the smallest prediction error is chosen (this strategy is illustrated in Example 2).
In the RKHSMetaMod package, the algorithm to calculate a sequence of the RKHS meta-models, the value of \(\mu_{\text{max}}\) (the smallest value of the tuning parameter \(\mu_g\) for which the solution of the RKHS group lasso problem is identically zero), and the prediction error are implemented as the RKHSMetMod, mu\(\_\)max, and PredErr functions, respectively. These functions are described in Section "Overview of the RKHSMetaMod functions" (supplementary materials), and illustrated in Example 1, Example 2, and Examples 1, 2, 3, respectively.
The variance of the function \(m\)
is estimated by the variance of the estimator \(\widehat{f}\). As the estimator \(\widehat{f}\) belongs to the RKHS \(\mathcal{H}\), it admits the Hoeffding
decomposition and, \[\begin{aligned}
\text{var}(\widehat{f}(X))=\sum_{v\in\mathcal{P}}\text{var}(\widehat{f}_v(X_v)),\quad\text{where}\quad \forall v\in\mathcal{P},\:
\text{var}(\widehat{f}_v(X_v))=E_X({\widehat{f}_v}^2(X_v))=\Vert \widehat{f}_v\Vert_2^2.
\end{aligned}\] In order to reduce the computational cost in
practice, one may estimate the variances of \(\widehat{f}_v(X_v)\), \(v\in\mathcal{P}\) by their empirical
variances. Let \(\widehat{f}_{v.}\) be
the empirical mean of \(\widehat{f}_v(X_{vi})\), \(i=1,...,n\), then: \[\begin{aligned}
\widehat{\text{var}}(\widehat{f}_v(X_v))=\frac{1}{n-1}\sum_{i=1}^n(\widehat{f}_v(X_{vi})-\widehat{f}_{v.})^2.
\end{aligned}\] For the groups \(v\) that do not belong to the support of
\(\widehat{f}\), we have \(\widehat{S}_v=0\) and for the groups \(v\) that belong to the support of \(\widehat{f}\), the estimators of the Sobol
indices of \(m\) are defined by, \[\begin{aligned}
\widehat{S}_v=\widehat{\text{var}}(\widehat{f}_v(X_v))/\sum_{v'\in\mathcal{P}}\widehat{\text{var}}(\widehat{f}_{v'}(X_{v'})).
\end{aligned}\] In the RKHSMetaMod package, the algorithm to calculate the empirical Sobol indices \(\widehat{S}_v\), \(v\in\mathcal{P}\) is implemented as the SI\(\_\)emp function. This function is described in Section "Companion functions" (supplementary materials) and illustrated in Examples 1, 2, 3.
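For illustration (not the SI\(\_\)emp internals), the computation reduces to normalized empirical variances of the fitted group terms:

``` r
# Sketch: empirical Sobol indices from the fitted group terms. fhat is
# assumed to be an n x q matrix whose columns are \hat f_v(X_vi) for
# the q groups in the support of \hat f (an illustrative placeholder).
emp_sobol <- function(fhat) {
  v_hat <- apply(fhat, 2, var)   # empirical variances (1/(n-1) factor)
  v_hat / sum(v_hat)             # \hat S_v for the groups in the support
}
```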
The RKHSMetaMod package implements two optimization algorithms: the RKHS ridge group sparse (see Algorithm 2) and the RKHS group lasso (see Algorithm 1). These algorithms rely on the Gram matrices \(K_v\), \(v\in\mathcal{P}\), which have to be positive definite. Therefore, the first and essential step in this package is to calculate these matrices and ensure their positive definiteness. The algorithm of this step is described in the next Section. The second step is to estimate the RKHS meta-model. In the RKHSMetaMod package, two different objectives, based on different procedures, are considered to calculate this estimator: the estimation of the RKHS meta-models associated with a given grid of values of the tuning parameters \((\mu,\gamma)\), and the estimation of a RKHS meta-model with at most a given number "\(qmax\)" of groups in its support (see the procedure leading to Algorithm 3).
The available kernels in the RKHSMetaMod package are: Gaussian kernel, Matérn \(3/2\) kernel, Brownian kernel, quadratic kernel and linear kernel. The usual presentation of these kernels is given in Table 1.
| Kernel type | Mathematical formula for \(u,v\in\mathbb{R}\) | RKHSMetaMod name |
|---|---|---|
| Gaussian | \(k_a(u,v)=\exp(-\Vert u-v\Vert^2/2r^2)\) | "gaussian" |
| Matérn 3/2 | \(k_a(u,v)=(1+\sqrt{3}\vert u-v\vert/r)\exp(-\sqrt{3}\vert u-v\vert/r)\) | "matern" |
| Brownian | \(k_a(u,v)=\min(u,v)+1\) | "brownian" |
| Quadratic | \(k_a(u,v)=(u^Tv+1)^2\) | "quad" |
| Linear | \(k_a(u,v)=u^Tv+1\) | "linear" |
The choice of the kernel, made by the user, determines the functional approximation space. For a chosen kernel, the algorithm to calculate the Gram matrices \(K_v\), \(v\in\mathcal{P}\) in the RKHSMetaMod package is implemented as the calc\(\_\)Kv function. This algorithm is based on three essential points:
Modify the chosen kernel:
In order to satisfy the conditions of constructing the RKHS \(\mathcal{H}\) described in Section "RKHS
construction" of the Appendix (supplementary materials), these kernels
are modified according to Equation "(2)" (see the Appendix
(supplementary materials)). Let us take the example of the Brownian
kernel:
The RKHS associated with the Brownian kernel \(k_a(X_a,X_a')=\min(X_a,X_a')+1\) is
well known to be \(\mathcal{H}_a=\{f:[0,1]\rightarrow \mathbb{R}\ \text{absolutely continuous},\ f(0)=0,\:\int_0^1{f'(X_a)}^2dX_a<\infty\},\) with the
inner product \(\langle f,h\rangle_{\mathcal{H}_a}=\int_0^1f'(X_a)h'(X_a)dX_a.\)
Straightforward calculations lead to the modified kernel \(k_{0a}\) as follows, \[\begin{aligned}
k_{0a}(X_a,X_a')=\min(X_a,X_a')+1-(3/4)(1+X_a-{X_a^2/2})(1+X_a'-{X_a'^2/2}).
\end{aligned}\] The RKHS associated with the kernel \(k_{0a}\) is the set \(\mathcal{H}_{0a}=\{f\in\mathcal{H}_a:\:\int_{0}^1f(X_a)dX_a=0\}\),
and we have \(\mathcal{H}= \mathbb{1} + \sum_{v \in \mathcal{P}} \mathcal{H}_{v}=\{f:[0,1]^d\rightarrow\mathbb{R}:\:f=f_0+\sum_{v\in\mathcal{P}}f_v(X_v),\ \text{with}\ f_v\in\mathcal{H}_v\}.\)
Remark 3. In the current version of the package,
we consider the input variables \(X=(X_1,...,X_d)\) that are uniformly
distributed on \([0,1]^d\). In order to
consider the input variables that are not distributed uniformly, it
suffices to modify a part of the function calc\(\_\)Kv related to the
calculation of the kernels \(k_{0a}\),
\(a=1,...,d\). For example, for \(X=(X_1,...,X_d)\) being distributed with
law \(P_X=\prod_{a=1}^d P_a\) on \(\mathcal{X}=\bigotimes_{a=1}^d\mathcal{X}_a\subset\mathbb{R}^d\),
the kernel \(k_{0a}\) associated with
the Brownian kernel is calculated as follows, \[\begin{aligned}
k_{0a}(X_a,X_a')&=\min(X_a,X_a')+1-\frac{\big(\int_{\mathcal{X}_a}(\min(X_a,u)+1)dP_a(u)\big)\big(\int_{\mathcal{X}_a}(\min(X_a',u)+1)dP_a(u)\big)}{\int_{\mathcal{X}_a}\int_{\mathcal{X}_a}(\min(u,v)+1)dP_a(u)dP_a(v)}.
\end{aligned}\] The other parts of the function calc\(\_\)Kv remain unchanged.
Calculate the Gram matrices \(K_v\) for all \(v\):
First, for all \(a=1,...,d\), the Gram
matrices \(K_a\) associated with
kernels \(k_{0a}\) are calculated using
Equation "(2)" (see the Appendix (supplementary materials)), \((K_a)_{i,i'}=k_{0a}(X_{ai},X_{ai'}).\)
Then, for all \(v\in\mathcal{P}\), the Gram matrices \(K_v\) associated with the kernels \(k_v=\prod_{a\in v}k_{0a}\) are computed as the Hadamard products \(K_v=\bigodot_{a\in v} K_a\).
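For illustration, a minimal sketch of this construction for the Brownian case (not the calc\(\_\)Kv internals), using the centered kernel \(k_{0a}\) written above:

``` r
# Sketch: Gram matrix of a group v as the Hadamard product of the
# univariate Gram matrices of the centered Brownian kernel k0a.
k0a <- function(s, t) pmin(s, t) + 1 - (3/4)*(1 + s - s^2/2)*(1 + t - t^2/2)
gram_a <- function(x) outer(x, x, k0a)    # K_a for one input variable
gram_v <- function(X, v) {                # K_v = Hadamard product over a in v
  Kv <- matrix(1, nrow(X), nrow(X))
  for (a in v) Kv <- Kv * gram_a(X[, a])
  Kv
}
# e.g. K_{1,3} for a design X on [0,1]^d: gram_v(X, c(1, 3))
```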
Ensure the positive definiteness of the matrices \(K_v\):
The output of function calc\(\_\)Kv is one of the input
arguments of the functions associated with the RKHS group lasso and the
RKHS ridge group sparse algorithms. Throughout these algorithms we need
to calculate the inverse and the square root of the matrices \(K_v\). In order to avoid numerical problems and ensure the invertibility of the matrices \(K_v\), it is mandatory to have these matrices positive definite. One way to render the matrices \(K_v\) positive definite is to add a nugget effect to them, that is, to modify the matrices \(K_v\) by adding a constant diagonal term, i.e. \(K_v+\epsilon\times I_n\). The value of \(\epsilon\) is computed based on the data through a part of the algorithm of the function calc\(\_\)Kv.
Let us briefly explain this part of the algorithm:
For each group \(v\in\mathcal{P}\), let
\(\lambda_{v,i},i=1,...,n\) be the
eigenvalues associated with the matrix \(K_v\). Set \(\lambda_{v,\text{max}}={\text{max}}_{i}\lambda_{v,i}\)
and \(\lambda_{v,\text{min}}={\text{min}}_{i}\lambda_{v,i}\).
For some fixed tolerance value tol, and for each matrix \(K_v\), if \(\lambda_{v,\text{min}} < \lambda_{v,\text{max}}\times\)tol, then the eigenvalues of \(K_v\) are replaced by \(\lambda_{v,i}+\epsilon\), with \(\epsilon\) equal to \(\lambda_{v,\text{max}}\times\)tol. The value of tol is set to \(10^{-8}\) by default, but one may consider a smaller or greater value for it depending on the chosen kernel and the value of \(n\).
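A minimal sketch of this correction (illustrative, not the package code):

``` r
# Sketch: add a nugget effect to Kv when lambda_min < lambda_max * tol,
# by adding epsilon = lambda_max * tol to all the eigenvalues.
make_pd <- function(Kv, tol = 1e-8) {
  e <- eigen(Kv, symmetric = TRUE)
  if (min(e$values) < max(e$values) * tol) {
    eps <- max(e$values) * tol
    Kv <- e$vectors %*% ((e$values + eps) * t(e$vectors))
  }
  Kv
}
```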
The function calc\(\_\)Kv is described in Section
"Companion functions" (supplementary materials) and illustrated in
Example 2.
The RKHS meta-model is the solution of one of the two following optimization problems: the minimization of the RKHS group lasso criterion presented in Equation (@ref(eq:Lasso)) (if \(\gamma=0\)), or the minimization of the RKHS ridge group sparse criterion presented in Equation (@ref(eq:parametric)) (if \(\gamma\neq0\)). In the following, the algorithms to solve these optimization problems are presented.
A popular technique for group-wise variable selection is the group lasso. With this procedure, depending on the value of the tuning parameter \(\mu\), an entire group of predictors may drop out of the model. An efficient algorithm for solving the group lasso problem is the classical block coordinate descent algorithm (Boyd et al. 2011; Bubeck 2015). Following the idea of Fu (1998), Yuan and Lin (2006) implemented a block-wise descent algorithm for the group lasso penalized least-squares, under the condition that the model matrices in each group are orthonormal. A block coordinate (gradient) descent algorithm for solving the group lasso penalized logistic regression was then developed by Meier, Geer, and Bühlmann (2008). This algorithm is implemented in the R package grplasso available from CRAN (Meier 2020). Yang and Zou (2015) proposed a unified algorithm, named group-wise majorization descent, for solving general group lasso learning problems, by assuming that the loss function satisfies a quadratic majorization condition. The implementation of their work is done in the gglasso R package available from CRAN (Yang, Zou, and Bhatnagar 2020).
In order to solve the RKHS group lasso optimization problem, we use the classical block coordinate descent algorithm. The minimization of the criterion \(C_g(f_0,\theta)\) (see Equation (@ref(eq:Lasso))) is done along one group \(v\) at a time. At each step of the algorithm, the criterion \(C_g(f_0,\theta)\) is minimized as a function of the current block's parameters, while the parameter values for the other blocks are fixed at their current values. The procedure is repeated until convergence. This procedure leads to Algorithm 1 (see the Appendix (supplementary materials) for more details on this procedure).
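As an illustration of one sweep of this scheme, here is a sketch under the criterion (@ref(eq:Lasso)); it is not the package's Algorithm 1. The zero test follows from the subgradient condition when \(K_v\) is positive definite, and the non-zero block update is delegated to a generic (slow) optimizer standing in for the dedicated update:

``` r
# Sketch: one block coordinate descent sweep for the RKHS group lasso.
# Kv and Kv_half are lists of the Gram matrices and their square roots,
# theta a list of current coefficient vectors (illustrative names).
bcd_sweep <- function(Y, Kv, Kv_half, theta, f0, mu) {
  n <- length(Y)
  for (v in seq_along(Kv)) {
    R <- Y - f0                                  # residual without block v
    for (w in seq_along(Kv)) if (w != v) R <- R - Kv[[w]] %*% theta[[w]]
    if (sqrt(sum((Kv_half[[v]] %*% R)^2)) <= n * mu / 2) {
      theta[[v]] <- rep(0, n)                    # the whole group drops out
    } else {
      obj <- function(th) sum((R - Kv[[v]] %*% th)^2) +
        n * mu * sqrt(sum((Kv_half[[v]] %*% th)^2))
      theta[[v]] <- optim(theta[[v]], obj, method = "BFGS")$par
    }
  }
  theta
}
```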
In the RKHSMetaMod package, the Algorithm 1 is implemented as RKHSgrplasso
function. This function is described in Section "Companion functions"
(supplementary materials) and illustrated in Example 2.
In order to solve the RKHS ridge group sparse optimization problem, we propose an adapted block coordinate descent algorithm, which proceeds in two steps. This procedure leads to Algorithm 2 (see the Appendix (supplementary materials) for more details on this procedure).
In the RKHSMetaMod package the Algorithm 2 is implemented as pen\(\_\)MetMod function. This
function is described in Section "Companion functions" (supplementary
materials) and illustrated in Example 2.
By considering some prior information about the data, one may be interested in a RKHS meta-model \(\widehat{f}\) with the number of groups in its support not greater than some "\(qmax\)". In order to obtain such an estimator, we provide the following procedure in the RKHSMetaMod package: first, the value \(\mu_{qmax}\) of the tuning parameter \(\mu\) for which the solution of the RKHS group lasso problem contains "\(qmax\)" groups in its support is sought; then, the RKHS ridge group sparse problem is solved with \(\mu_{qmax}\) and a grid of values of \(\gamma>0\). This procedure leads to Algorithm 3.
This algorithm is implemented in the RKHSMetaMod package, as
function RKHSMetMod\(\_\)qmax (see Section "Main
RKHSMetaMod functions" (supplementary materials) for more details on
this function).
Remark 4. As both terms in the penalty function of the criterion (@ref(eq:parametric)) enforce sparsity on the solution, the estimator obtained by solving the RKHS ridge group sparse problem associated with the pair of tuning parameters \((\mu_{qmax},\gamma>0)\) may contain a smaller number of groups than the solution of the RKHS group lasso optimization problem (i.e. the RKHS ridge group sparse problem with \((\mu_{qmax},\gamma=0)\)). Therefore, the estimated RKHS meta-model contains at most "\(qmax\)" groups in its support.
Let us consider the g-function of Sobol (A. Saltelli, Chan, and Scott 2009) in the Gaussian regression framework, i.e. \(Y=m(X)+\varepsilon\). The error term \(\varepsilon\) is a centered Gaussian random variable with variance \(\sigma^2\), and \(m\) is the g-function of Sobol defined over \([0,1]^d\) by, \[\begin{aligned} \label{gfct} m(X)=\prod_{a=1}^d\frac{\vert 4X_a-2\vert +c_a}{1+c_a},\:c_a>0 . \end{aligned} (\#eq:gfct)\] The Sobol indices of the g-function can be expressed analytically: \[\begin{aligned} \forall v\in\mathcal{P},\:S_v=\frac{1}{D}\prod_{a\in v}D_a,\:D_a=\frac{1}{3(1+c_a)^2},\:D=\prod_{a=1}^d(D_a+1)-1. \end{aligned}\] Set \(c_1=0.2\), \(c_2=0.6\), \(c_3=0.8\) and \((c_a)_{a>3}=100\). With these values of coefficients \(c_a\), the variables \(X_1, X_2\) and \(X_3\) explain \(99.98\%\) of the variance of function \(m(X)\) (see Table 4).
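For illustration, these closed-form quantities are directly computable; a short sketch with the coefficients used here:

``` r
# Analytic Sobol indices of the g-function for the coefficients below,
# following the closed-form expression above (values in [0,1]).
c_a <- c(0.2, 0.6, 0.8, rep(100, 7))
D_a <- 1/(3*(1 + c_a)^2)
D   <- prod(D_a + 1) - 1
S_1  <- D_a[1]/D         # first order index of X1 (~0.433, cf. Table 4)
S_12 <- D_a[1]*D_a[2]/D  # order-two index of {1,2} (~0.056)
```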
In this Section, three examples are presented. In all examples, the value of Dmax is set to three. Example 1 illustrates
the use of the RKHSMetMod function by considering three
different kernels, "matern", "brownian", and
"gaussian" (see Table 1), and three
datasets of \(n\in\{50,100,200\}\)
observations and \(d=5\) input
variables. The larger datasets with \(n\in\{1000,2000,5000\}\) observations and
\(d=10\) input variables are studied in
Examples 2 and 3. In each
example, two independent datasets are generated: \((X,Y)\) to estimate the meta-models, and
\((XT,YT)\) to estimate the prediction
errors. The design matrices \(X\) and
\(XT\) are Latin Hypercube Samples of the input variables, generated using the maximinLHS function of the package lhs available from CRAN (Carnell 2021):
library(lhs); X <- maximinLHS(n, d); XT <- maximinLHS(n, d)
The response variables \(Y\) and \(YT\) are calculated as \(Y=m(X)+\varepsilon\) and \(YT=m(XT)+\varepsilon_T\), where \(\varepsilon\), and \(\varepsilon_T\) are centered Gaussian random variables with \(\sigma^2=(0.2)^2\):
a <- c(0.2, 0.6, 0.8, 100, 100, 100, 100, 100, 100, 100)[1:d]
g <- 1; for (i in 1:d) g <- g*(abs(4*X[,i]-2)+a[i])/(1+a[i])
sigma <- 0.2 # standard deviation, so that var(epsilon) = sigma^2 = (0.2)^2
epsilon <- rnorm(n, 0, sigma); Y <- g + epsilon # rnorm takes the sd, not the variance
gT <- 1; for (i in 1:d) gT <- gT*(abs(4*XT[,i]-2)+a[i])/(1+a[i])
epsilonT <- rnorm(n, 0, sigma); YT <- gT + epsilonT
Example 1. RKHS meta-model estimation using
RKHSMetMod function:
In this example, three maximinLHS datasets of \(n\) points over
\([0,1]^d\), with \(n\in\{50,100,200\}\) and \(d=5\), are generated, and a grid of five
values of tuning parameters \(\mu\) and
\(\gamma\) is considered as follows:
\[\mu_{(1:5)}={\mu_{max}/(\sqrt{n}\times
2^{(2:6)})},\quad \gamma_{(1:5)}=(0.2,0.1,0.01,0.005,0).\] For
each dataset, the experiment is repeated \(N_r=50\) times. At each repetition, the
RKHS meta-models associated with the pair of tuning parameters \((\mu,\gamma)\) are estimated using the
RKHSMetMod function:
Dmax <- 3; kernel <- "matern" # kernel <- "brownian" # kernel <- "gaussian"
gamma <- c(0.2, 0.1, 0.01, 0.005, 0); frc <- 1/(0.5^(2:6))
res <- RKHSMetMod(Y, X, kernel, Dmax, gamma, frc, FALSE)
These meta-models are evaluated using a testing dataset. The
prediction errors are computed for them using the PredErr
function. The RKHS meta-model with minimum prediction error is chosen to
be the best estimator for the model. Finally, the Sobol indices
are computed for the best RKHS meta-model using the function
SI\(\_\)emp:
Err <- PredErr(X, XT, YT, mu, gamma, res, kernel, Dmax)
SI <- SI_emp(res, Err)
The vector mu contains the values of the tuning parameter \(\mu\) that are calculated within the function RKHSMetMod. It can be recovered from the output of the RKHSMetMod function as follows:
mu <- vector()
l <- length(gamma); for(i in 1:length(frc)){mu[i] <- res[[(i-1)*l+1]]$mu}
The performance of this method for estimating a meta-model is evaluated by considering a third dataset \((m(X^{third}_i),X_i^{third})\), \(i=1,...,N\), with \(N=1000\). The global prediction error is calculated as follows:
Let \(\widehat{f}_r(.)\) be the best RKHS meta-model obtained in repetition \(r\), \(r=1,...,N_r\); then \[\begin{aligned} GPE=\frac{1}{N_{r}}\sum_{r=1}^{N_{r}}\frac{1}{N}\sum_{i=1}^N(\widehat{f}_r(X^{third}_i)-m(X^{third}_i))^2. \end{aligned}\] The values of \(GPE\) obtained for different kernels and values of \(n\) are given in Table 2, where the subscripts \(m\), \(b\) and \(g\) refer to the "matern", "brownian" and "gaussian" kernels, respectively.
| \(n\) | \(50\) | \(100\) | \(200\) |
|---|---|---|---|
| \(GPE_m\) | \(0.13\) | \(0.07\) | \(0.03\) |
| \(GPE_b\) | \(0.14\) | \(0.10\) | \(0.05\) |
| \(GPE_g\) | \(0.15\) | \(0.10\) | \(0.07\) |
As expected, the value of \(GPE\) decreases as \(n\) increases. The lowest values of \(GPE\) are obtained when using the "matern" kernel.
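As a complement, the \(GPE\) is just a double average; a minimal sketch, assuming a matrix pred of predictions (one row per repetition) and the vector m_third of true values at the \(N\) third-dataset points (both illustrative placeholders):

``` r
# GPE: mean over the Nr repetitions of the mean squared prediction
# error at the N third-dataset points (pred is Nr x N).
GPE <- mean(sweep(pred, 2, m_third)^2)
```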
In order to sum up the behaviour of the procedure for estimating the Sobol indices, we consider the mean square error (MSE) criterion, \(\sum_v(\sum_{r=1}^{N_r}(\widehat{S}_{v,r}-S_{v})^2/{N_r})\), where, for each group \(v\), \(S_v\) denotes the true value of the Sobol index, and \(\widehat{S}_{v,r}\) is the empirical Sobol index of the best RKHS meta-model in repetition \(r\). The obtained values of MSE for different kernels and values of \(n\) are given in Table 3 (with the same kernel subscripts as in Table 2).
| \(n\) | \(50\) | \(100\) | \(200\) |
|---|---|---|---|
| \(MSE_m\) | \(75.12\) | \(46.72\) | \(28.22\) |
| \(MSE_b\) | \(110.71\) | \(84.99\) | \(41.06\) |
| \(MSE_g\) | \(78.22\) | \(94.67\) | \(67.02\) |
As expected, the values of MSE are smaller for larger values of \(n\). The smallest values are obtained when
using the "matern" kernel.
The means of the empirical Sobol indices of the best RKHS meta-models over all repetitions, for \(n=200\) and the "matern" kernel, are displayed in Table 4.
| \(v\) | \(\{1\}\) | \(\{2\}\) | \(\{3\}\) | \(\{1,2\}\) | \(\{1,3\}\) | \(\{2,3\}\) | \(\{1,2,3\}\) | sum |
|---|---|---|---|---|---|---|---|---|
| \(S_v\) | 43.30 | 24.30 | 19.20 | 5.63 | 4.45 | 2.50 | 0.57 | 99.98 |
| \(\widehat{S}_{v,.}\) | 46.10 | 26.33 | 20.62 | 2.99 | 2.22 | 1.13 | 0.0 | 99.39 |
It appears that the estimated Sobol indices are close to the true ones, nevertheless, they are overestimated for the main effects, i.e. groups \(v\in\{\{1\},\{2\},\{3\}\}\), and underestimated for the interactions of order two and three, i.e. groups \(v\in\{\{1,2\},\{1,3\},\{2,3\},\{1,2,3\}\}\). Note that the strategy of choosing the tuning parameters is based on the minimization of the prediction error of the estimated meta-model, which may not minimize the error of estimating the Sobol indices.
Taking into account the results obtained in Example 1, the calculations in the rest of the examples are done using only the "matern" kernel.
Example 2. A time-efficient strategy to obtain the "optimal" tuning parameters when dealing with large datasets:
A maximinLHS dataset of \(n\) points over \([0,1]^d\), with \(n=1000\) and \(d=10\), is generated. First, we use
functions calc\(\_\)Kv and
mu\(\_\)max
to compute the eigenvalues and eigenvectors of the positive definite
matrices \(K_v\), and the value of
\(\mu_{max}\), respectively:
kernel <- "matern"; Dmax <- 3
Kv <- calc_Kv(X, kernel, Dmax, TRUE, TRUE)
mumax <- mu_max(Y, Kv$kv)
Then, we consider the two following steps:
Set \(\gamma=0\) and \(\mu_{(1:9)}=\mu_{max}/(\sqrt{n}\times 2^{(2:10)})\). Calculate the RKHS meta-models associated with the values of \(\mu_g=\mu\times\sqrt{n}\)
by using the function RKHSgrplasso. Gather the obtained
RKHS meta-models in a list, res\(\_\)g (while this job could be
done using the function RKHSMetMod by setting \(\gamma=0\), in this example, we use the
function RKHSgrplasso in order to avoid the re-calculation
of \(K_v\)’s at the next step).
Thereafter, for each estimator in res\(\_\)g, the prediction error is
calculated by considering a new dataset and using the function
PredErr. The value of \(\mu\) with the smallest error of prediction
in this step is denoted by \(\mu_{i}\).
Let us implement this step:
For a grid of values of \(\mu_g\), a sequence of RKHS meta-models is calculated and gathered in the res\(\_\)g list:
mu_g <- c(mumax*0.5^(2:10))
res_g <- list(); resg <- list()
for(i in 1:length(mu_g)){
resg[[i]] <- RKHSgrplasso(Y, Kv, mu_g[i], 1000, FALSE)
res_g[[i]] <- list("mu_g"=mu_g, "gamma"=0, "MetaModel"=resg[[i]])
}
The output res\(\_\)g contains nine RKHS meta-models, and they are evaluated using a testing dataset:
gamma <- c(0); Err_g <- PredErr(X, XT, YT, mu_g, gamma, res_g, kernel, Dmax)
The prediction errors of the RKHS meta-models obtained in this step are displayed in Table 5.
| \(\mu_g\) | \(1.304\) | \(0.652\) | \(0.326\) | \(0.163\) | \(0.081\) | \(0.041\) | \(0.020\) | \(0.010\) | \(0.005\) |
|---|---|---|---|---|---|---|---|---|---|
| \(\gamma=0\) | \(0.197\) | \(0.156\) | \(0.145\) | \(0.097\) | \(0.063\) | \(0.055\) | \(0.056\) | \(0.063\) | \(0.073\) |
It appears that the minimum prediction error corresponds to the solution of the RKHS group lasso algorithm with \(\mu_g=0.041\), so \(\mu_i=0.041/\sqrt{n}\).
Choose a smaller grid of values of \(\mu\), \((\mu_{(i-1)},\mu_i,\mu_{(i+1)})\), and set
a grid of values of \(\gamma>0\).
Calculate the RKHS meta-models associated with each pair of the tuning
parameters \((\mu,\gamma)\) by the
function pen\(\_\)MetMod. Calculate the
prediction errors for the new sequence of the RKHS meta-models using the
function PredErr. Compute the empirical Sobol indices for
the best estimator. Let us go back to the implementation of the example and apply this second step:
The grid of the values of \(\mu\) in
this step is \((0.081,0.041,0.020)/\sqrt{n}.\) The RKHS
meta-models associated with this grid of the values of \(\mu\) are gathered in a new list
resgnew. We set \(\gamma_{(1:4)}=(0.2, 0.1, 0.01, 0.005)\),
and we calculate the RKHS meta-models for this new grid of the values of
\((\mu,\gamma)\) using
pen\(\_\)MetMod function:
gamma <- c(0.2, 0.1, 0.01, 0.005); mu <- c(mu_g[5], mu_g[6], mu_g[7])/sqrt(n)
resgnew <- list()
resgnew[[1]] <- resg[[5]]; resgnew[[2]] <- resg[[6]]; resgnew[[3]] <- resg[[7]]
res <- pen_MetMod(Y, Kv, gamma, mu, resgnew, 0, 0)
The output res is a list of twelve RKHS meta-models.
These meta-models are evaluated using a new dataset, and their
prediction errors are displayed in Table 6.
| \(\mu\) | \(0.081/\sqrt{n}\) | \(0.041/\sqrt{n}\) | \(0.020/\sqrt{n}\) |
|---|---|---|---|
| \(\gamma=0.2\) | \(0.153\) | \(0.131\) | \(0.119\) |
| \(\gamma=0.1\) | \(0.098\) | \(0.079\) | \(0.072\) |
| \(\gamma=0.01\) | \(0.065\) | \(0.054\) | \(0.053\) |
| \(\gamma=0.005\) | \(0.064\) | \(0.054\) | \(0.054\) |
The minimum prediction error is associated with the pair \((0.020/\sqrt{n},0.01)\), and the best RKHS meta-model is then \(\widehat{f}_{(0.020/\sqrt{n},0.01)}\).
The performance of this procedure for estimating the Sobol indices is
evaluated using the relative error (RE) defined as follows:
For each \(v\), let \(S_v\) be the true value of the Sobol index displayed in Table 4 and \(\widehat{S}_v\) be the estimated empirical Sobol index. Then \[\label{relerr}
RE=\sum_v{\vert\widehat{S}_v-S_v\vert/S_v}. (\#eq:relerr)\] In
Table 7 the estimated empirical Sobol indices,
their sum, and the value of RE are displayed.
| \(v\) | \(\{1\}\) | \(\{2\}\) | \(\{3\}\) | \(\{1,2\}\) | \(\{1,3\}\) | \(\{2,3\}\) | \(\{1,2,3\}\) | sum | RE |
|---|---|---|---|---|---|---|---|---|---|
| \(\widehat{S}_v\) | \(42.91\) | \(25.50\) | \(20.81\) | \(4.40\) | \(3.84\) | \(2.13\) | \(0.00\) | \(99.60\) | \(1.64\) |
The obtained RE, summed over all groups \(v\), equals \(1.64\); the estimated Sobol indices in this example are therefore close to the true values of the Sobol indices displayed in the first row of Table 4.
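For illustration, the RE of Equation (@ref(eq:relerr)) is a one-line computation; with the values of Tables 4 and 7:

``` r
# Sketch: relative error of Equation (eq:relerr), using the true indices
# of Table 4 and the estimates of Table 7 (in percent).
S_true <- c(43.30, 24.30, 19.20, 5.63, 4.45, 2.50, 0.57)
S_hat  <- c(42.91, 25.50, 20.81, 4.40, 3.84, 2.13, 0.00)
RE <- sum(abs(S_hat - S_true)/S_true)   # ~ 1.64
```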
Example 3. Dealing with larger datasets:
Two maximinLHS datasets of \(n\) points over \([0,1]^d\), with \(n\in\{2000,5000\}\) and \(d=10\), are generated. In order to obtain
one RKHS meta-model associated with one pair of the tuning parameters \((\mu,\gamma)\), the number of coefficients to be estimated is equal to \(n\times\)vMax\(=n\times 175\), where vMax\(=\binom{10}{1}+\binom{10}{2}+\binom{10}{3}=175\) is the number of groups \(v\) with \(\vert v\vert\leq\)Dmax. Table 8 gives the execution time of the different functions used throughout Examples 1-3. In all examples, we used a cluster of computers with:
2 Intel Xeon E5-2690 processors (2.90GHz) and 96Gb Ram (6x16Gb of memory
1600MHz).
| \((n,d)\) | calc\(\_\)Kv | mu\(\_\)max | RKHSgrplasso | pen\(\_\)MetMod | \(\vert S_{\widehat{f}}\vert\) | sum |
|---|---|---|---|---|---|---|
| (100,5) | 0.09s | 0.01s | 1s | 2s | 18 | \(\sim\) 3s |
|  |  |  | 2s | 3s | 19 | \(\sim\) 5s |
| (500,10) | 33s | 9s | 247s | 333s | 39 | \(\sim\) 10min |
|  |  |  | 599s | 816s | 64 | \(\sim\) 24min |
| (1000,10) | 197s | 53s | 959s | 1336s | 24 | \(\sim\) 42min |
|  |  |  | 2757s | 4345s | 69 | \(\sim\) 2h |
| (2000,10) | 1498s | 420s | 3984s | 4664s | 12 | \(\sim\) 2h:56min |
|  |  |  | 12951s | 22385s | 30 | \(\sim\) 10h:20min |
| (5000,10) | 34282s | 6684s | 38957s | 49987s | 11 | \(\sim\) 36h:05min |
|  |  |  | 99221s | 111376s | 15 | \(\sim\) 69h:52min |
As we can see, the execution time increases quickly as \(n\) increases. In Figure 1, the logarithm of the time (in seconds) versus the logarithm of \(n\) is displayed for the functions calc\(\_\)Kv, mu\(\_\)max, RKHSgrplasso and pen\(\_\)MetMod. The time of RKHSgrplasso and pen\(\_\)MetMod is displayed for two pairs of values of the tuning parameters, \((\mu_1=\mu_{max}/(\sqrt{n}\times 2^7),\gamma=0.01)\) in solid lines and \((\mu_2=\mu_{max}/(\sqrt{n}\times 2^8),\gamma=0.01)\) in dashed lines. It appears that the algorithms of these functions are of polynomial time \(O(n^\alpha)\), with \(\alpha\backsimeq3\) for the functions calc\(\_\)Kv and mu\(\_\)max, and \(\alpha\backsimeq2\) for the functions RKHSgrplasso and pen\(\_\)MetMod.
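These exponents can be checked from Table 8 itself; a minimal sketch, regressing log-times on log-\(n\) for calc\(\_\)Kv (the slope estimates \(\alpha\)):

``` r
# Sketch: empirical complexity exponent of calc_Kv from the timings of
# Table 8 (d = 10 rows); the fitted slope is close to 3.
n_vals <- c(500, 1000, 2000, 5000)
t_vals <- c(33, 197, 1498, 34282)   # calc_Kv times in seconds
coef(lm(log(t_vals) ~ log(n_vals)))[2]
```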
Taking into account the results obtained for the prediction error and the values of \((\widehat{\mu},\widehat{\gamma})\) in Example 2, in this example, only two values of the tuning parameter \(\mu\), \(\mu_{(1:2)}=\mu_{max}/(\sqrt{n}\times2^{(7:8)})\), and one value of the tuning parameter \(\gamma=0.01\) are considered. The RKHS meta-models associated with the pairs of values \((\mu_i,\gamma)\), \(i=1,2\) are estimated using the RKHSMetMod function:
kernel <- "matern"; Dmax <- 3
gamma <- c(0.01); frc <- 1/(0.5^(7:8))
res <- RKHSMetMod(Y, X, kernel, Dmax, gamma, frc, FALSE)
The prediction error and the empirical Sobol indices are then
calculated for the obtained meta-models using the functions
PredErr and SI\(\_\)emp:
mu <- vector(); mu[1] <- res[[1]]$mu; mu[2] <- res[[2]]$mu
Err <- PredErr(X, XT, YT, mu, gamma, res, kernel, Dmax)
SI <- SI_emp(res, NULL)
Table 9 gives the estimated empirical Sobol indices as well as their sum, the values of RE (see Equation (@ref(eq:relerr))), and the prediction errors associated with the obtained estimators.
| \(n\) | \(v\) | \(\{1\}\) | \(\{2\}\) | \(\{3\}\) | \(\{1,2\}\) | \(\{1,3\}\) | \(\{2,3\}\) | \(\{1,2,3\}\) | sum | RE | Err |
|---|---|---|---|---|---|---|---|---|---|---|---|
| \(2000\) | \(\widehat{S}_{v;(\mu_1,\gamma)}\) | \(45.54\) | \(24.78\) | \(21.01\) | \(3.96\) | \(3.03\) | \(1.65\) | \(0.00\) | \(99.97\) | \(2.12\) | \(0.052\) |
| \(\widehat{S}_{v;(\mu_2,\gamma)}\) | \(45.38\) | \(25.07\) | \(19.69\) | \(4.36\) | \(3.66\) | \(1.79\) | \(0.00\) | \(99.95\) | \(1.79\) | \(0.049\) | |
| \(5000\) | \(\widehat{S}_{v;(\mu_1,\gamma)}\) | \(44.77\) | \(25.39\) | \(20.05\) | \(4.49\) | \(3.38\) | \(1.90\) | \(0.00\) | \(99.98\) | \(1.81\) | \(0.049\) |
| \(\widehat{S}_{v;(\mu_2,\gamma)}\) | \(43.78\) | \(24.99\) | \(19.56\) | \(5.43\) | \(3.90\) | \(2.32\) | \(0.00\) | \(99.98\) | \(1.29\) | \(0.047\) |
For \(n=5000\), we obtained smaller values of RE and of the prediction error (Err) than for \(n=2000\). So, as expected, the estimation of the Sobol indices as well as the prediction quality are better for larger values of \(n\).
In Figure 2, the estimation quality and the Sobol indices for the dataset with \(n\) equal to \(5000\), \(d\) equal to \(10\), and \((\mu_2,\gamma)\) are displayed.
The line \(y = x\) in red crosses the cloud of points as long as the values of the g-function are smaller than three. When the values of the g-function are greater than three, the estimator \(\widehat{f}\) tends to underestimate the g-function. Concerning the Sobol indices obtained by the estimator \(\widehat{f}\), as illustrated in the right-hand plot, with the exception of the groups \(\{1\}\), \(\{2\}\), \(\{3\}\), \(\{1,2\}\), \(\{1,3\}\), and \(\{2,3\}\), for which we obtained significant values of the Sobol indices, for all other groups the estimated Sobol indices are very small and almost zero.
This section includes two examples. In the first example we reproduce an example from Gu, Palomo, and Berger (2019) and compare the prediction quality of the RKHS meta-model with the GP (Kriging) based meta-models from the RobustGaSP (Gu, Palomo, and Berger 2019) and DiceKriging (Roustant, Ginsbourger, and Deville 2012) packages. The objective is to evaluate the quality of the RKHS meta-model and to compare it with methods recently proposed for approximating complex models. In this first example we consider a one-dimensional model and focus on the comparison between the true model and the estimated meta-models. In the second example we reproduce an example from Roustant, Ginsbourger, and Deville (2012), which allows us to compare the prediction quality of the RKHS meta-model with the Kriging based meta-model from the DiceKriging package, as well as the estimation quality of the Sobol indices in our package with the well-known package sensitivity. For the sake of comparison between the three methods, the meta-models are calculated using the same experimental design and outputs, and the same kernel function, available in all three packages, is used. However, in the packages RobustGaSP and DiceKriging the range parameter \(r\) (see Table 1) of the kernel function is estimated, by marginal posterior modes with the robust parameterization and by MLE with upper and lower bounds, respectively, while it is fixed and set to \(\sqrt{3}/2\) in the RKHSMetaMod package.
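Since the range parameter is fixed in RKHSMetaMod, the one-dimensional Matérn \(3/2\) kernel used below has a fully specified closed form. A minimal sketch, assuming the standard Matérn \(3/2\) expression:
# Matern 3/2 kernel with fixed range parameter r (standard closed form)
matern32 <- function(x, xp, r = sqrt(3)/2) {
  h <- sqrt(3) * abs(x - xp) / r
  (1 + h) * exp(-h)
}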
Example 4. "The modified sine wave function":
We consider the \(1\)-dimensional modified sine wave function defined by \(m(X)=3\sin(5\pi X)+\cos(7\pi X)\) over \([0,1]\). The same experimental design as described in Gu, Palomo, and Berger (2019) is considered: the design matrix \(X\) is a sequence of \(12\) equally spaced points on \([0,1]\), and the response variable \(Y\) is calculated as \(Y = m(X)\):
X <- as.matrix(seq(0,1,1/11)); Y <- sinewave(X)
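Here the sinewave function implements \(m(X)\); it is defined in Gu, Palomo, and Berger (2019) and can be written directly from the expression above (a minimal sketch):
# modified sine wave function m(X) = 3 sin(5 pi X) + cos(7 pi X)
sinewave <- function(x) 3 * sin(5 * pi * x) + cos(7 * pi * x)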
We build the GP based meta-models with the RobustGaSP and DiceKriging packages, using the constant mean function and the Matérn \(3/2\) kernel:
library(RobustGaSP)
res.rgasp <- rgasp(design=X, response=Y, kernel_type="matern_3_2")
library(DiceKriging)
res.km <- km(design=X, response=Y, covtype="matern3_2")
As \(d=1\), we have \(Dmax=1\). We consider the grid of values of
\(\mu_{(1:9)}=\mu_{max}/(\sqrt{n}\times
2^{(2:10)})\) and \(\gamma_{(1:5)}=(0.2,0.1,0.01,0.005,0)\).
The RKHS meta-models associated with the pairs of values \((\mu_i,\gamma_j)\), \(i = 1,\cdots,9\), \(j=1,\cdots,5\), are estimated using the RKHSMetMod function:
kernel <- "matern"; Dmax <- 1
gamma <- c(0.2, 0.1, 0.01, 0.005, 0); frc <- 1/(0.5^(2:10))
res <- RKHSMetMod(Y, X, kernel, Dmax, gamma, frc, FALSE)
Given a testing dataset \((XT,YT)\), the prediction errors associated with the obtained RKHS meta-models are calculated using the PredErr function, and the best RKHS meta-model is chosen to be the estimator of the model \(m(X)\):
XT <- as.matrix(seq(0,1,1/11)); YT <- sinewave(XT)
# gather the distinct values of the tuning parameter mu stored in the
# fitted meta-models (each component of res holds its value in res[[i]]$mu)
mu <- unique(sapply(res, function(x) x$mu))
Err <- PredErr(X, XT, YT, mu, gamma, res, kernel, Dmax)
To compare these three estimators in terms of the prediction quality, we perform prediction on \(100\) test points, equally spaced in \([0,1]\):
predict_X <- as.matrix(seq(0,1,1/99))
#prediction with the GP based meta-models:
rgasp.predict <- predict(res.rgasp, predict_X)
km.predict <- predict(res.km, predict_X, type='UK')
#prediction with the best RKHS meta-model:
res.predict <- prediction(X, predict_X, kernel, Dmax, res, Err)
The prediction results are plotted in Figure 3. The black circles, which correspond to the predictions from the RKHSMetaMod package, are closer to the real output than the green and blue circles corresponding to the predictive means from the RobustGaSP and DiceKriging packages.
The meta-model results are plotted in Figure 4. The prediction from the RKHSMetaMod package, plotted as the black curve, is a much more accurate estimate of the true function (plotted in red) than the predictive means from the RobustGaSP and DiceKriging packages, plotted as the blue and green curves, respectively. As already noted by Gu, Palomo, and Berger (2019), for this sine wave example the meta-model from the DiceKriging package "degenerates to the fitted mean with spikes at the design points".
Example 5. "A standard SA 8-dimensional example":
We consider the \(8\)-dimensional g-function of Sobol implemented in the package sensitivity: the function \(m(X)\) as defined in Equation (@ref(eq:gfct)), with coefficients \(c_1=0,\:c_2=1,\:c_3=4.5,\:c_4=9,\:(c_a)_{a=5,...,8}=99\). With these values of the coefficients \(c_a\), the variables \(X_1\), \(X_2\), \(X_3\) and \(X_4\) explain \(99.96\%\) of the variance of the function \(m(X)\) (see Table 10).
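The true Sobol indices \(S_v\) reported in Table 10 follow from the analytic ANOVA decomposition of the g-function. A minimal sketch of their computation, assuming the standard form \(m(X)=\prod_{a=1}^d(|4X_a-2|+c_a)/(1+c_a)\):
# true Sobol indices of the g-function from its analytic variance decomposition
c_a <- c(0, 1, 4.5, 9, 99, 99, 99, 99)
D_a <- 1 / (3 * (1 + c_a)^2)        # partial variance of each main effect
D <- prod(1 + D_a) - 1              # total variance
S_main <- 100 * D_a / D             # first order indices, in percent
S_12 <- 100 * D_a[1] * D_a[2] / D   # index of the interaction group {1,2}: 5.97
round(S_main[1:4], 2)               # 71.62 17.90 2.37 0.72, as in Table 10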
We consider the same experimental design as described in Roustant, Ginsbourger, and Deville (2012): the
design matrices \(X\) and \(XT\) are \(80\)-point optimal Latin Hypercube Samples
of the input variables generated by the optimumLHS function
of the package lhs, and the response variables \(Y\) and \(YT\) are calculated as \(Y = m(X)\) and \(YT = m(XT)\) using the sobol.fun function of the package sensitivity:
n <- 80; d <- 8
library(lhs); X <- optimumLHS(n, d); XT <- optimumLHS(n, d)
library(sensitivity); Y <- sobol.fun(X); YT <- sobol.fun(XT)
Let us first consider the RKHS meta-model method. We set Dmax\(=3\), and we consider the grid of values of
\(\mu_{(1:9)}=\mu_{max}/(\sqrt{n}\times
2^{(2:10)})\), and \(\gamma_{(1:5)}=(0.2,0.1,0.01,0.005,0)\).
The RKHS meta-models associated with the pairs of values \((\mu_i,\gamma_j)\), \(i = 1,\cdots,9\), \(j=1,\cdots,5\), are estimated using the RKHSMetMod function:
kernel <- "matern"; Dmax <- 3
gamma <- c(0.2, 0.1, 0.01, 0.005, 0); frc <- 1/(0.5^(2:10))
res <- RKHSMetMod(Y, X, kernel, Dmax, gamma, frc, FALSE)
Given the testing dataset \((XT,YT)\), the prediction errors associated with the obtained RKHS meta-models are calculated using the PredErr function, and the best RKHS meta-model is chosen to be the estimator of the model \(m(X)\). Finally, the empirical Sobol indices are computed for the best RKHS meta-model using the function SI\(\_\)emp:
# gather the distinct values of the tuning parameter mu from the fitted meta-models
mu <- unique(sapply(res, function(x) x$mu))
Err <- PredErr(X, XT, YT, mu, gamma, res, kernel, Dmax)
SI <- SI_emp(res, Err)
Secondly, let us build the GP based meta-model. We use the km function of the package DiceKriging with the constant mean function and the Matérn \(3/2\) kernel:
library(DiceKriging)
res.km <- km(design = X, response = Y, covtype = "matern3_2")
The Sobol indices associated with the estimated GP based meta-model are calculated using the fast99 function of the package sensitivity:
SI.km <- fast99(model = kriging.mean, factors = d, n = 1000,
q = "qunif", q.arg = list(min = 0, max = 1), m = res.km)
where the kriging.mean function is defined in Roustant, Ginsbourger, and Deville (2012).
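A minimal sketch of such a wrapper, following the idea in Roustant, Ginsbourger, and Deville (2012) (the exact code in that paper may differ):
# wrapper returning the Kriging predictive mean, to be called by fast99
kriging.mean <- function(Xnew, m) {
  predict.km(m, Xnew, type = "UK", se.compute = FALSE, checkNames = FALSE)$mean
}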
The results of the estimation with the best RKHS meta-model and the Kriging based meta-model are drawn in Figure 5.
The black circles, which correspond to the best RKHS meta-model, are closer to the real output than the blue circles corresponding to the GP based meta-model from the DiceKriging package. Another way to evaluate the prediction quality of the estimated meta-models is to consider the mean square error of the fitted meta-model, computed as \(\sum_{i=1}^{80}(m(X_i)-\widehat{f}(X_i))^2/80\). We obtained \(3.96\%\) for the Kriging based meta-model and \(0.07\%\) for the RKHS meta-model, which confirms the good behavior of the RKHS meta-model.
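As a minimal sketch, assuming f.hat.rkhs and f.hat.km are hypothetical vectors holding the fitted values of the two meta-models at the design points, this criterion can be computed as:
# mean square error of a fitted meta-model at the n = 80 design points
mse <- function(y, f.hat) mean((y - f.hat)^2)
err.rkhs <- mse(Y, f.hat.rkhs)  # hypothetical fitted values of the RKHS meta-model
err.km <- mse(Y, f.hat.km)      # hypothetical fitted values of the Kriging meta-model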
The estimated Sobol indices associated with the RKHS meta-model and the Kriging based meta-model are given in Table 10.
| \(v\) | \(\{1\}\) | \(\{2\}\) | \(\{3\}\) | \(\{4\}\) | \(\{1,2\}\) | \(\{1,3\}\) | \(\{1,4\}\) | \(\{2,3\}\) | \(\{2,4\}\) | \(\{1,2,3\}\) | \(\{1,2,4\}\) | sum |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(S_v\) | \(71.62\) | \(17.90\) | \(2.37\) | \(0.72\) | \(5.97\) | \(0.79\) | \(0.24\) | \(0.20\) | \(0.06\) | \(0.07\) | \(0.02\) | \(99.96\) |
| \(\widehat{S}_v\) | \(75.78\) | \(17.42\) | \(1.71\) | \(0.47\) | \(4.00\) | \(0.05\) | \(0.07\) | \(0.28\) | \(0.09\) | \(0.00\) | \(0.00\) | \(99.87\) |
| \(\widehat{S}_{km_v}\) | \(71.18\) | \(15.16\) | \(1.42\) | \(0.44\) | \(0.00\) | \(0.00\) | \(0.00\) | \(0.00\) | \(0.00\) | \(0.00\) | \(0.00\) | \(88.20\) |
As shown, with the RKHS meta-model we obtained non-zero values for the interactions of order two. Concerning the main effects, except for the first one, the Sobol indices estimated with the RKHS meta-model are closer to the true ones. However, the interactions of order three are missed by both meta-models. For an overall comparison of the estimation quality of the Sobol indices, one may consider the criterion RE defined in Equation (@ref(eq:relerr)), which is equal to \(7.95\) for the Kriging based meta-model and \(5.59\) for the RKHS meta-model. Comparing the values of RE, we can point out that the Sobol indices are better estimated with the RKHS meta-model in this example.
In this paper, we proposed an R package, called RKHSMetaMod, that estimates a meta-model of a complex model \(m\). This meta-model belongs to a reproducing kernel Hilbert space constructed as a direct sum of Hilbert spaces (Durrande et al. 2013). The estimation of the meta-model is carried out via a penalized least-squares minimization, allowing both to select and estimate the terms in the Hoeffding decomposition, and therefore, to select and estimate the Sobol indices that are non-zero (Huet and Taupin 2017). This procedure makes it possible to estimate Sobol indices of high order, a point known to be difficult in practice. Using convex optimization tools, the RKHSMetaMod package implements two optimization algorithms: the minimization of the RKHS ridge group sparse criterion (@ref(eq:parametric)) and of the RKHS group lasso criterion (@ref(eq:Lasso)). Both of these algorithms rely on the Gram matrices \(K_v\), \(v\in\mathcal{P}\), and their positive definiteness. Currently, the package considers only uniformly distributed input variables. If one is interested in another distribution of the input variables, it suffices to modify the calculation of the kernels \(k_{0a}\), \(a=1,...,d\), in the function calc\(\_\)Kv of this package (see Remark 3). The kernels available in the RKHSMetaMod package are: the Gaussian kernel (with the fixed range parameter \(r=1/2\)), the Matérn kernel (with the fixed range parameter \(r=\sqrt{3}/2\)), the Brownian kernel, the quadratic kernel and the linear kernel (see Table 1). Depending on the problem under study, one may consider other kernels, or kernels with different values of the range parameter \(r\), and add them easily to the list of kernels in the calc\(\_\)Kv function. For large values of \(n\) and \(d\), the calculation and storage of the eigenvalues and eigenvectors of all the Gram matrices \(K_v\), \(v\in\mathcal{P}\), require a lot of time and a very large amount of memory. In order to optimize the execution time and the storage memory, except for one function that is written in R, all of the functions of the RKHSMetaMod package are written using the efficient C++ libraries through the RcppEigen and RcppGSL packages. These functions are then interfaced with the R environment in order to provide a user-friendly package.