Main fitting function
The main estimation function gplsim implements both the profile likelihood algorithm of Yu, Wu, and Zhang (2017) and the non-linear least squares (NLS) method of Yu and Ruppert (2002), which are described in the previous section. By default, the profile likelihood method is used for responses from a general exponential family and the non-linear least squares method is used for others.
The usage and input arguments of the main fitting function gplsim are summarized as follows:
gplsim(Y, X, Z, family = gaussian, penalty = TRUE, profile = TRUE, user.init = NULL, bs = "ps", ...)
This function takes three required arguments: the response variable \(Y\) as a vector, the single-index nonlinear predictors \(X\) as a matrix or vector, and the linear predictors \(Z\) as a matrix or vector. Note that all input covariates must be numeric.
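As a rough illustration of the expected input shapes, the following sketch simulates a small data set and calls gplsim with its three required arguments. The data-generating model, the sample size, and the names theta and beta are assumptions made for this example only; the fitting call is attempted only when the gplsim package is installed.

```r
# Simulated inputs for illustration only; the data-generating model is an assumption.
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 3), n, 3)   # single-index (nonlinear) predictors, numeric matrix
Z <- matrix(rnorm(n * 2), n, 2)   # linear predictors, numeric matrix
theta <- c(1, 1, 1) / sqrt(3)     # hypothetical true single-index direction
beta <- c(0.5, -0.5)              # hypothetical true linear coefficients
Y <- as.vector(sin(X %*% theta) + Z %*% beta + rnorm(n, sd = 0.2))  # response vector

# All inputs are required to be numeric.
stopifnot(is.numeric(Y), is.numeric(X), is.numeric(Z))

# Fit with the default settings if the package is available.
if (requireNamespace("gplsim", quietly = TRUE)) {
  fit <- gplsim::gplsim(Y, X, Z, family = gaussian)
  print(class(fit))
}
```

Factor or character covariates would need to be converted to numeric columns (for example, dummy variables) before being passed as X or Z.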
This function also takes several optional arguments for finer control. The optional argument family is a family object from the built-in R package stats. This object is a list of functions and expressions that define the link and variance functions. Supported link functions include identity for the Gaussian family; logit, probit, and cloglog for the binomial family; log for the Poisson family; and inverse for the Gamma family. Other families accepted by glm and mgcv::gam can also be used.
The optional argument penalty is a logical value that specifies whether penalized or un-penalized splines are used to fit the model. The default value is TRUE, which selects penalized splines. The optional argument profile is a logical value that indicates whether the profile likelihood algorithm or the NLS algorithm is used. The default is the profile likelihood algorithm.
The optional argument user.init is a numeric vector whose length equals the number of single-index predictors. Users can supply initial single-index coefficients through this argument, based on prior information or domain knowledge. The default value is NULL, which instructs the function to estimate the initial single-index coefficients with a generalized linear model.
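To make the family and user.init arguments more concrete, the sketch below inspects a family object from stats and computes a GLM-based starting direction by hand. The helper name glm_init and its normalization (unit length, non-negative first component) are assumptions for illustration; the text above states only that the default initial coefficients come from a generalized linear model, so this is not necessarily what gplsim does internally.

```r
# A family object bundles the link and variance functions.
fam <- binomial(link = "logit")
fam$linkfun(0.5)    # logit(0.5) = 0
fam$variance(0.5)   # mu * (1 - mu) = 0.25

# Hypothetical helper: GLM-based starting values for the single-index coefficients.
# The normalization is an assumption, not necessarily gplsim's internal choice.
glm_init <- function(Y, X, Z, family = gaussian) {
  X <- as.matrix(X)
  Z <- as.matrix(Z)
  fit <- glm(Y ~ X + Z, family = family)
  theta <- coef(fit)[seq_len(ncol(X)) + 1]  # skip the intercept, keep the X block
  theta <- theta / sqrt(sum(theta^2))       # scale to unit length
  if (theta[1] < 0) theta <- -theta         # fix the sign
  unname(theta)
}

# Simulated binary data; the model below is assumed for illustration.
set.seed(2)
n <- 300
X <- matrix(rnorm(n * 2), n, 2)
Z <- matrix(rnorm(n), n, 1)
eta <- X %*% c(0.8, 0.6) + 0.5 * Z[, 1]
Y <- rbinom(n, size = 1, prob = plogis(eta))

init <- glm_init(Y, X, Z, family = binomial)
init   # numeric vector of length ncol(X), in the form expected by user.init
```

A vector of this form could then be supplied as user.init in place of the default NULL.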
Because we use mgcv::gam and mgcv::s as the underlying algorithms for estimating the unknown univariate function of the single index, several arguments can be passed on to mgcv::gam and mgcv::s for finer control. For example, the optional argument bs is a character value that specifies the spline basis used to estimate the single-index function; it is passed to mgcv::s. The default is “ps” (P-splines with a B-spline basis); other choices include “tr” (truncated power basis), “tp” (thin plate regression splines), and others (see the help page of mgcv::smooth.terms). Further arguments can be passed to mgcv::s through ..., including the optional numeric argument k, the dimension of the basis of the smooth term, and the argument m, the order of the penalty for the smooth term.
Users can also pass the argument scale to mgcv::gam through .... It is a numeric indicator with a default value of -1. Any negative value, including -1, indicates that the scale of the response distribution is unknown and needs to be estimated. A value of 0 indicates a scale of 1 for the Poisson and binomial distributions and an unknown scale for others. Any positive value is taken as the known scale parameter.
The optional argument smoothing_selection is a character value that specifies the criterion used to select the smoothing parameter \(\lambda\). This argument corresponds to the argument method in mgcv::gam, but it is renamed in this package to avoid confusion. The supported criteria are “GCV.Cp”, “GACV.Cp”, “ML”, “P-ML”, “P-REML”, and “REML”; the default is “GCV.Cp”. For more details on these arguments, users may refer to the help pages of mgcv::gam and mgcv::s.
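For a fixed index direction, the kind of mgcv call that these arguments control can be sketched as follows. This shows only how bs, k, m, scale, and the smoothing criterion enter mgcv::s and mgcv::gam; it is not the internal code of gplsim, and the simulated data, the direction theta, and the values k = 10 and m = 2 are assumptions chosen for illustration.

```r
library(mgcv)

# Simulated data; the model and the fixed direction theta are assumptions.
set.seed(3)
n <- 200
X <- matrix(rnorm(n * 2), n, 2)
z <- rnorm(n)
theta <- c(0.6, 0.8)                 # assumed single-index direction (unit length)
u <- as.vector(X %*% theta)          # single index
y <- sin(u) + 0.5 * z + rnorm(n, sd = 0.2)
dat <- data.frame(y = y, u = u, z = z)

fit <- gam(
  # P-spline smooth of the index: basis dimension 10; m = 2 gives a cubic
  # B-spline basis with a second-order difference penalty.
  y ~ s(u, bs = "ps", k = 10, m = 2) + z,
  family = gaussian,
  data = dat,
  method = "REML",  # smoothing-parameter criterion (the `method` argument of gam)
  scale = -1        # negative value: scale parameter is unknown and estimated
)

fit$sp       # selected smoothing parameter
class(fit)   # "gam" "glm" "lm"
```

Changing bs, k, or m alters the basis used for the smooth of the index, while method and scale affect how the smoothing parameter and the dispersion are handled.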
The function gplsim returns an object of class gplsim, which extends the gam and glm classes.