Abstract

Inverse estimation is a classical and well-known problem in regression. In simple terms, it involves the use of an observed value of the response to make inference on the corresponding unknown value of the explanatory variable. To our knowledge, however, statistical software is somewhat lacking the capabilities for analyzing these types of problems. In this paper, we introduce investr (which stands for inverse estimation in R), a package for solving inverse estimation problems in both linear and nonlinear regression models.¹

Introduction

Consider the regression model \(\mathcal{Y}_i = f\left( x_i; \boldsymbol{\beta} \right) + \epsilon_i\) \((i = 1, \dotsc, n)\), where \(f\) is a known expectation function (called a calibration curve) that is monotonic over the range of interest and \(\epsilon_i \stackrel{iid}{\sim} \mathcal{N}\left( 0, \sigma^2 \right)\). A common problem in regression is to predict a future response \(\mathcal{Y}_0\) from a known value of the explanatory variable \(x_0\). Often, however, there is a need to do the reverse; that is, given an observed value of the response \(\mathcal{Y} = y_0\), estimate the unknown value of the explanatory variable \(x_0\). This is known as the calibration problem, though we refer to it more generally as inverse estimation. In this paper, we consider only controlled calibration, where the values of the explanatory variables are fixed by design. A more thorough overview of the calibration problem, including Bayesian approaches and multivariate calibration, is given in Osborne (1991).

There are three main functions available in the investr package (Greenwell 2014):

calibrate;
invest;
plotFit.

calibrate operates on objects of class "lm" and can only be used when the expectation function has the form \(f(x_i; \boldsymbol{\beta}) = \beta_0 + \beta_1 x_i\) (i.e., the simple linear regression model), where closed-form solutions are available for calculating confidence intervals for \(x_0\). For more complicated models (e.g., polynomial and nonlinear regression), closed-form expressions are usually not available and iterative techniques will be required. This is the purpose of the function invest, which calculates the point estimate and confidence intervals for \(x_0\) by calling the function uniroot from the stats package. The function plotFit produces a scatterplot of the data with fitted regression curve and the option to add confidence/prediction bands for the response (pointwise or adjusted). It can be used with single-predictor objects of class "lm" or "nls"; however, for objects of class "nls", confidence/prediction bands are based on the linear approximation and can be misleading (Bates and Watts 1988, 65). The development version of investr can be found on GitHub at https://github.com/w108bmg/investr. To report bugs or issues, contact the main author or submit them to https://github.com/w108bmg/investr/issues.

Calibration for straight line regression

Consider the most common calibration model, \(\mathcal{Y}_i = \beta_0 + \beta_1 x_i + \epsilon_i\) \((i = 1, \dotsc, n)\), where \(x_i\) is fixed and \(\epsilon_i \stackrel{iid}{\sim} \mathcal{N}\left( 0, \sigma^2 \right)\). Suppose we obtain a series of \(m\) observations \(y_{01}, \dotsc, y_{0m}\) which are associated with the same (unknown) \(x_0\). It can be shown (F. A. Graybill 1976) that the maximum likelihood (ML) estimator of \(x_0\), also known as the classical estimator, is

\[\label{eq:x0-mle} \widehat{x}_0 = \frac{\bar{y}_0 - \widehat{\beta}_0}{\widehat{\beta}_1}, (\#eq:x0-mle) \]

where \(\widehat{\beta}_0\) and \(\widehat{\beta}_1\) are the usual ML estimators of \(\beta_0\) and \(\beta_1\), respectively, and \(\bar{y}_0 = \sum_{j = 1}^m y_{0j}/m\). The classical estimator in general is computed as \(\widehat{x}_0 = f^{-1}\left( \bar{y}_0; \widehat{\boldsymbol{\beta}} \right)\); that is, by inverting the fitted calibration curve at \(\bar{y}_0\) (Eisenhart 1939). Since \(\widehat{x}_0\) is a ratio of jointly normal random variables, it is not surprising to learn that it does not have any finite moments (think of a standard Cauchy distribution). The sampling distribution of \(\widehat{x}_0\) is complicated, but fortunately, not required for setting an exact \(100(1 - \alpha)\%\) confidence interval on \(x_0\). This is discussed in the next section.

Inversion interval

It can be shown (see, for example, Draper and Smith 1998, 47–51) that an exact \(100(1 - \alpha)\%\) confidence interval for \(x_0\) is given by

\[\label{eq:x0-inversion-interval} \widehat{x}_0 + \frac{\left( \widehat{x}_0 - \bar{x} \right)g \pm \left( t\widehat{\sigma}/\widehat{\beta}_1 \right)\sqrt{\left( \widehat{x}_0 - \bar{x} \right)^2 / S_{xx} + (1 - g)\left( \frac{1}{m} + \frac{1}{n} \right)}}{1 - g}, (\#eq:x0-inversion-interval) \]

where \(S_{xx} = \sum_{i=1}^n \left( x_i - \bar{x} \right)^2\), \(g = \left( t^2 \widehat{\sigma}^2 \right) / \left( \widehat{\beta}_1^2 S_{xx} \right)\) and \(t = t_{\alpha/2, n + m - 3}\) is the upper \(1 - \alpha\) percentile of a Student’s \(t\)-distribution with \(n + m - 3\) degrees of freedom. The inversion interval @ref(eq:x0-inversion-interval) can also be obtained using a fiducial argument, as in Fieller (1954). For the special case \(m = 1\), the inversion interval is equivalent to inverting a \(100(1 - \alpha)\%\) prediction interval for the response. In other words, if one draws a horizontal line through a scatterplot of the data at \(y_0\), then the abscissas of its intersection with the usual (pointwise) prediction band for \(f\) correspond to the endpoints of the inversion interval @ref(eq:x0-inversion-interval). This interval should only be used when the usual \(\mathcal{F}\) test for testing \(\mathcal{H}_0: \beta_1 = 0\) versus \(\mathcal{H}_1: \beta_1 \ne 0\) can be rejected at the specified \(\alpha\) level; in other words, when the regression line is not “too” flat. If \(\mathcal{H}_0\) is not rejected, then such a confidence interval for \(x_0\) may result in either the entire real line or two semi-infinite intervals (Franklin A. Graybill and Iyer 1994, 429)—see Figure 2. The plotFit function in the investr package can be used for drawing scatterplots with the fitted model and confidence/prediction bands. To calculate the inversion interval for the linear calibration problem, we use the calibrate function and specify the option interval = "inversion" (the default) as in the following example.

The data frame arsenic contains the true amounts of arsenic present in 32 water samples (Franklin A. Graybill and Iyer 1994). Also present, is the amount of arsenic measured by some field test, which is subject to error. A new water sample is obtained and subjected to the field test producing a reading of 3.0 \(\mu\)g/ml. It is desired to infer the true amount of arsenic present in the sample. The following code fits a simple linear regression model to the arsenic data and then uses the calibrate function to compute the ML estimate and corresponding 90% calibration interval based on Equation @ref(eq:x0-inversion-interval).

library(investr)
mod <- lm(measured ~ actual, data = arsenic)
(res <- calibrate(mod, y0 = 3, interval = "inversion", level = 0.9))
  
## Figure 1
plotFit(mod, interval = "prediction", level = 0.9, shade = TRUE, col.pred = "skyblue")
abline(h = 3, v = c(res$lower, res$estimate, res$upper), lty = 2)

Running the above block of code produces Figure 1 and the following output to the R console:

estimate    lower    upper 
  2.9314   2.6035   3.2587

where estimate is the ML estimate @ref(eq:x0-mle), and lower/upper correspond to the lower/upper bounds of the 90% inversion interval for \(x_0\) @ref(eq:x0-inversion-interval). If instead the new water sample was subjected to the field test three times, thereby producing three response values corresponding to \(x_0\), say 3.17, 3.09, and 3.16 \(\mu\)g/ml, we would simply supply calibrate with a vector of these values as in

calibrate(mod, y0 = c(3.17, 3.09, 3.16), interval = "inversion", level = 0.9)

If interval = "inversion", and the slope of the model is not ignificant at the specified \(\alpha\) level, then finite confidence limits for \(x_0\) will not be produced. For example, suppose badfit is an "lm" object for which the slope is not significant at the \(\alpha = 0.1\) level. Then, as illustrated in Figure 2,

calibrate(badfit, y0 = 10, level = 0.9)

will either produce two semi-infinite intervals, e.g.,

Error: The calibration line is not well determined. The resulting
confidence region is the union of two semi-infinite intervals:
( -Inf , -282.0006 ) U ( 393.1267 , Inf)

or the entire real line, e.g.,

estimate    lower    upper 
-97.5987     -Inf      Inf 
Warning message:
The calibration line is not well determined.

Figure 1: Scatterplot of the arsenic data with fitted calibration line and pointwise prediction band. A horizontal reference line is drawn through the observed value \(y_0\) = 3. The vertical lines identify the position of the point estimate and 90% confidence bounds for \(x_0\).

Figure 2: Hypothetical \(1-\alpha\) (pointwise) prediction bands. Left: Horizontal line at \(y_0\) intersects the prediction band at two points, resulting in a finite interval. Middle: Horizontal line at \(y_0\) does not intersect the prediction band at all resulting in an infinite interval. Right: Horizontal line at \(y_0\) only intersects with one side of the prediction band resulting in two semi-infinite intervals.

Wald interval

Another common approach to computing calibration intervals is to use the delta method (Dorfman 1938). It is easy to show that an approximate standard error for the ML estimator @ref(eq:x0-mle), based on a first-order Taylor series approximation, is given by

\[\label{eq:x0-se} \widehat{\mathrm{se}}\left( \widehat{x}_0 \right) = \frac{\widehat{\sigma}}{\left| \widehat{\beta}_1 \right|} \sqrt{\frac{1}{m} + \frac{1}{n} + \frac{\left( \widehat{x}_0 - \bar{x} \right)^2}{S_{xx}}}. (\#eq:x0-se) \]

Assuming large sample normality for \(\widehat{x}_0\) leads to an approximate \(100(1 - \alpha)\%\) Wald confidence interval for \(x_0\) of

\[\label{eq:x0-wald-interval} \widehat{x}_0 \pm t_{\alpha/2, n+m-3} \frac{\widehat{\sigma}}{\left| \widehat{\beta}_1 \right|} \sqrt{\frac{1}{m} + \frac{1}{n} + \frac{\left( \widehat{x}_0 - \bar{x} \right)^2}{S_{xx}}}. (\#eq:x0-wald-interval) \]

This is equivalent to taking \(g = 0\) in the inversion interval @ref(eq:x0-inversion-interval). Unlike the inversion interval, though, the Wald interval always exists and is symmetric about \(\widehat{x}_0\). This symmetry is attractive, but not always realistic, such as when the calibration curve \(f\) is nonlinear and has horizontal asymptotes. To obtain a Wald-type interval we specify interval = "Wald" in the call to calibrate. For instance,

calibrate(mod, y0 = 3, interval = "Wald", level = 0.9)

produces the output

estimate    lower    upper       se 
  2.9314   2.6040   3.2589   0.1929

The point estimate remains unchanged (as expected) but the calibration interval is slightly different. The major benefit of using the delta method to compute calibration intervals is that it always exists and provides us with an asymptotic estimate of the standard error, which in this case gives \(\widehat{\mathrm{se}} = 0.1929\). The bootstrap, as discussed in Section Bootstrap calibration intervals, also provides calibration intervals and an estimate of the standard error, but does not require large samples or specific distribution assumptions.

Confidence interval for a specified mean response

Instead of inferring the value of \(x_0\) that corresponds to an observed value of the response \(y_0\), suppose we want to infer the value of \(x_0\) that corresponds to a specified value of the mean response, say \(f\left( x_0; \boldsymbol{\beta} \right) = \mu_0\). The obvious ML estimate of \(x_0\) is

\[\label{eq:x0-mle-regulation} \widehat{x}_0 = \frac{\mu_0 - \widehat{\beta}_0}{\widehat{\beta}_1}, (\#eq:x0-mle-regulation) \]

which is the same as Equation @ref(eq:x0-mle) but with \(\bar{y}_0\) replaced with \(\mu_0\), a fixed population parameter. This is analogous to the difference between (i) predicting a future value of the response corresponding to a given value of the explanatory variable and (ii) estimating the mean response that corresponds to a particular value of the explanatory variable. Thus, the point estimates @ref(eq:x0-mle) and @ref(eq:x0-mle-regulation) are the same but the former has greater variability inherited from the variance of \(\overline{\mathcal{Y}}_0\). This is sometimes referred to as regulation (as opposed to calibration) and is described in more detail in Franklin A. Graybill and Iyer (1994, chap. 6). The confidence interval formulas for \(x_0\) corresponding to a specified mean response (i.e., regulation) are the same as those given in Equations @ref(eq:x0-inversion-interval) and @ref(eq:x0-wald-interval), but with \(1/m\) replaced with \(0\) and \(t_{\alpha/2, n + m - 3}\) replaced with \(t_{\alpha/2, n - 2}\). For instance, the Wald interval for \(x_0\) corresponding to \(\mu_0\) is simply \[\widehat{x}_0 \pm t_{\alpha/2, n - 2}\frac{\widehat{\sigma}}{\left| \widehat{\beta}_1 \right|} \sqrt{\frac{1}{n} + \frac{\left( \widehat{x}_0 - \bar{x} \right)^2}{S_{xx}}},\]

where \(\widehat{x}_0\) is calculated as in Equation @ref(eq:x0-mle-regulation). To obtain calibration intervals corresponding to a specified mean response, use the option mean.response = TRUE.

To illustrate, consider the crystal growth data taken from Franklin A. Graybill and Iyer (1994, 119). These data are from an experiment in which the weight in grams of 14 crystals were recorded after letting the crystals grow for different (predetermined) amounts of time in hours. The weight of each crystal is plotted against time in Figure 3. Suppose we want to estimate the growth time in hours that corresponds to an average weight of 8 grams; that is, we want to estimate \(x_0\) corresponding to \(\mu_0 = 8\). The following code chunk fits a simple linear regression model to the crystal growth data and then computes the ML estimate and a 95% calibration interval for \(x_0\).

mod <- lm(weight ~ time, data = crystal)
(res <- calibrate(mod, y0 = 8, mean.response = TRUE))
  
## Figure 3
plotFit(mod, interval = "confidence", pch = 19, shade = TRUE, col.conf = "plum", 
        extend.range = TRUE)
abline(h = 8, v = c(res$lower, res$estimate, res$upper), lty = 2)

The output for this code chunk should be

estimate    lower    upper 
 15.8882  14.6590  17.1596

Thus, in order to produce crystals with an average weight of 8 grams, they should be grown for an estimated 15.8882 hours. A 95% confidence interval for the growth time is (14.6590, 17.1596). Obviously, if mean.response = TRUE, then y0 can only take a single value; otherwise, an error will be displayed as in the following:

calibrate(mod, y0 = c(8, 9), mean.response = TRUE)

which displays the message

Error in calibrate.default(cbind(x, y), ...) : 
  Only one mean response value allowed.

Figure 3: Scatterplot of the crystal growth data with fitted calibration line and pointwise confidence band. A horizontal reference line is drawn at \(\mu_0\) = 8. The vertical lines identify the position of the point estimate and 95% confidence bounds for the growth time \(x_0\).

This type of calibration problem is similar to computing the median effective dose (\(ED_{0.5}\)), or more generally \(ED_p\) where \(0 < p < 1\), in binary response models. In R, these models are usually fit with the glm function from the stats package. The function dose.p from the MASS package (Venables and Ripley 2002) can then be used to compute the ML estimate of \(ED_p\) for specified \(p\). An estimate of the asymptotic standard error based on the delta method is also given which can be used to calculate a Wald-based confidence interval for \(ED_p\). A future release of investr will likely make an inversion-type interval available for these models as well (i.e., by inverting a confidence interval for the mean response on the link scale). The package drc (Ritz and Streibig 2005), for fitting dose-response curves, may also be of interest.

Simultaneous calibration intervals

The calibration intervals discussed so far are one-at-a-time intervals. If \(k\) new values of the response are observed where each corresponds to a different unknown, say \(x_{01}, \dotsc x_{0k}\), then we need to adjust the critical value used in the inversion and Wald intervals accordingly. The simplest approaches are of course, the Bonferroni and Scheffé procedures. These can be computed by specifying the adjust option which can take any of the following arguments: "none" (the default), "bonferroni", or "scheffe". The value \(k\) also needs to be specified. See Miller (1981, chap. 3) for more details.

Nonlinear and polynomial calibration

In application, many relationships are nonlinear (e.g., dose-response curves). The classical estimator along with the inversion and Wald intervals can easily be extended to such nonlinear calibration curves. However, classical inference in these models (such as prediction intervals) are based on large samples and linear approximations (see Bates and Watts 1988, chap. 2). Thus, for nonlinear calibration curves, the inversion interval provides only an approximate \(100(1 - \alpha)\%\) confidence interval for \(x_0\) as does the Wald interval. Calibration in nonlinear models is discussed in further detail in Schwenke and Milliken (1991), Seber and Wild (2003, 245–50), and Huet (2004, chap. 5). For calibration in polynomial models, see Brown (1993, 47–88) and Seber and Lee (2003, 172).

The invest function can be used for inverse estimation with any univariate regression model in R that inherits from class "lm" or "nls" (with the exception of weighted fits). For instance, consider the regression model \(\mathcal{Y}_i = f\left( x_i; \boldsymbol{\theta} \right) + \epsilon_i\) \((i = 1, \dotsc, n)\), where \(f\) may or may not be linear in \(\boldsymbol{\theta}\). If we wish to estimate \(x_0\) given an observation \(y_0\), then the point estimate \(\widehat{x}_0\) is given by solving \(y_0 = f\left(x; \widehat{\boldsymbol{\theta}}\right)\) for \(x\). The solution will be unique as long as \(f\) is monotonic in the region of interest. The invest function computes \(\widehat{x}_0\) by calling the stats functions predict and uniroot to solve \[y_0 - f\left(x; \widehat{\boldsymbol{\theta}}\right) = 0\] numerically for \(x\).

Approximate confidence intervals

Equation @ref(eq:x0-inversion-interval) gives a closed-form expression for the inversion interval for the case of simple linear regression. In more complicated cases, such an expression is not available and the interval must be found numerically. It can be shown (see, for example, Seber and Wild 2003, 245–46) that

\[\label{eq:predictive-pivot} \mathcal{T}_{x_0} = \frac{\mathcal{Y}_0 - f\left(x_0; \widehat{\boldsymbol{\theta}}\right)}{\left\{\widehat{\sigma}^2 + \widehat{S}_{x_0}^2\right\}^{1/2}} \stackrel{\cdot}{\sim} t_{n-p}, (\#eq:predictive-pivot) \]

where \(\widehat{S}_{x_0}^2\) is the estimated variance of \(f\left(x_0; \widehat{\boldsymbol{\theta}}\right)\). An approximate \(100(1-\alpha)\%\) confidence interval for \(x_0\) is then given by the set

\[\label{eq:inversion} \bigg\{ x: t_{\alpha/2, n-p} < \mathcal{T}_x < t_{1-\alpha/2, n-p} \bigg\}. (\#eq:inversion) \]

Essentially, invest finds the lower and upper limits for this interval by solving the equations \[\mathcal{T}_x - t_{\alpha/2, n-p} = 0 \quad \text{and} \quad \mathcal{T}_x - t_{1-\alpha/2, n-p} = 0\] numerically for \(x_0\). For the special case of the simple linear regression model, these limits coincide with Equation @ref(eq:x0-inversion-interval) and the coverage probability is exactly \(1-\alpha\).

The Wald interval for \(x_0\) is also easily extended. It has the basic form: \(\widehat{x}_0 \pm t_{\alpha/2, n-p}\widehat{\mathrm{se}}\left(\widehat{x}_0\right)\), where \(p\) is the dimension of \(\boldsymbol{\theta}\). The estimated standard error \(\widehat{\mathrm{se}}\left(\widehat{x}_0\right)\) is based on a first-order Taylor series approximation. For the special case of the simple linear regression model, this approximation results in Equation @ref(eq:x0-se).

Since the point estimate and confidence intervals for \(x_0\) are obtained numerically using uniroot, invest has the additional options lower, upper, tol, and maxiter. See the reference manual for details.

To get an idea of how invest works, consider the data in Figure 4 which were generated from the model \(\mathcal{Y} = 5 + x-\sin(x) + 1.5\mathcal{Z}\), where \(\mathcal{Z}\) is a standard normal random variable. Suppose we want to estimate \(x_0\) given a new observation, say \(y_0 = 22\). Fitting a model of the form \(f(x; \theta) = \theta_1 + \theta_2\left[x-\sin(x)\right]\) to the sample, we obtain \(\widehat{f}(x) = 5.490 + 0.941\left[x-\sin(x)\right]\), which is certainly invertible, but \(\widehat{f}^{-1}\) cannot be expressed in closed-form using a finite number of terms; thus, \(\widehat{x}_0 = \widehat{f}^{-1}(y_0 = 22)\) must be obtained numerically. The following chunk of code generates the sample data, calculates \(\widehat{x}_0\) and an approximate 95% Wald interval for \(x_0\) using the invest function.

set.seed(101) # for reproducibility
x <- rep(seq(from = 0, to = 25, by = 2), each = 2)
d <- data.frame(x, y = 5 + x - sin(x) + rnorm(length(x), sd = 1.5))
mod <- lm(y ~ I(x - sin(x)), data = d)
res <- invest(mod, y0 = 22, interval = "Wald")

## Figure 4
plotFit(mod, interval = "prediction", shade = TRUE, col.pred = "seagreen1",
        extend.range = TRUE)
abline(h = 22, v = res$estimate, lty = 2)

The point estimate is \(\widehat{x}_0 = 16.7053\) with an estimated standard error of 0.8909. The code used by invest for obtaining the point estimate is essentially

uniroot(function(x) predict(mod, newdata = list("x" = x)) - 22, 
        lower = min(d$x), upper = max(d$x))$root

The code for obtaining the standard error used in the Wald interval is slightly more involved:

## Write x0.hat as function of parameters (theta1.hat, theta2.hat, Y0)
x0.fun <- function(params, object = mod) {
  object$coefficients <- params[1:2]
  uniroot(function(x) predict(object, list("x" = x)) - params[3], 
          lower = 0, upper = 25, tol = 1e-10)$root
}

## Variance-covariance matrix of (theta1.hat, theta2.hat, Y0)'
covmat <- diag(3)
covmat[1:2, 1:2] <- vcov(mod)
covmat[3, 3] <- summary(mod)$sigma^2

## Numerically evaluate gradient
params <- c(coef(mod), y0 = 22)
grad <- attr(numericDeriv(quote(x0.fun(params)), "params"), "gradient")

## Calculate standard error
(se <- sqrt(grad %*% covmat %*% t(grad)))

Figure 4: Scatterplot of simulated data with fitted calibration line and pointwise prediction band. A horizontal reference line is drawn through the observed value \(y_0\) = 22. The vertical line identifies the position of the point estimate \(\widehat{x}_0\).

The following example uses the nasturtium data from the drc package. These data were analyzed in Racine-Poon (1988) using an approximate Bayesian approach. Bioassays were performed on a type of garden cress called nasturtium. Six replicates of the response (plant weight in mg after the third week of growth) were measured at seven preselected concentrations of an agrochemical. The weights corresponding to three new soil samples, all sharing the same (unknown) concentration \(x_0\), were observed to be 309, 296, and 419 mg. The block of code below fits a log-logistic model \[f(x; \boldsymbol{\theta}) = \left\{ \begin{array}{l l} \theta_1, &\quad x = 0, \\ \theta_1 / \left[1 + \exp\left\{\theta_2 + \theta_3\ln(x)\right\}\right], &\quad x > 0, \end{array} \right.\] and computes an approximate 95% inversion interval for \(x_0\).

## Load package containing nasturtium data
library(drc)
  
## Fit log-logistic model
mod <- nls(weight ~ theta1/(1 + exp(theta2 + theta3 * log(conc))),
           start = list(theta1 = 1000, theta2 = -1, theta3 = 1), 
           data = nasturtium)
plotFit(mod, interval = "prediction") # figure not shown
  
## Compute approximate 95% inversion interval
invest(mod, y0 = c(309, 296, 419), interval = "inversion")

The interval obtained is (1.7722, 2.9694) with a point estimate of 2.2639. We can check the point estimate manually. Some algebra gives \[\widehat{x}_0 = \exp\left\{\frac{\log\left(\widehat{\theta}_1/\bar{y}_0 - 1\right) - \widehat{\theta}_2}{\widehat{\theta}_3}\right\} = 2.2639.\]

A Wald interval can also be obtained by running the following line of code:

invest(mod, y0 = c(309, 296, 419), interval = "Wald")

the output for which is

estimate    lower    upper       se 
  2.2639   1.6889   2.8388   0.2847

In certain cases (i.e., when \(\widehat{x}_0\) can be written in closed-form), one can use the very useful deltaMethod function from the car package (Fox and Weisberg 2011) to compute the approximate standard error used in the Wald interval. For the nasturtium example, the minimal code to obtain \(\widehat{\mathrm{se}}\left(\widehat{x}_0\right)\) is

## Using the deltaMethod function in the car package
library(car)
covmat <- diag(4)
covmat[1:3, 1:3] <- vcov(mod)
covmat[4, 4] <- summary(mod)$sigma^2 / 3 # since length(y0) = 3
(se <- deltaMethod(c(coef(mod), y0.bar = mean(c(309, 296, 419))), 
                   g = "exp((log(theta1 / y0.bar - 1) - theta2) / theta3)", 
                   vcov. = covmat)$SE)

which produces an estimated standard error of 0.2847019, the same as that obtained automatically by invest.

As indicated by the bootstrap distribution obtained in the next section, the symmetric Wald interval obtained here seems unrealistic for this problem. In the next section, we show how to obtain a bias-corrected and accelerated (\(BC_a\)) bootstrap confidence interval for \(x_0\) using the boot package (Canty and Ripley 2014).

Bootstrap calibration intervals

The bootstrap (Efron 1979) provides an alternative means for computing calibration intervals. This is useful in nonlinear calibration problems where inference traditionally relies on large samples, approximate normality, and linear approximations. A practical guide to the bootstrap is provided by Davison and Hinkley (1997) and the companion R package boot, and bootstrap resampling for controlled calibration is discussed in Jones and Rocke (1999). Although it is likely for a "bootstrap" option to appear in a future release of investr, it is quite simple to set up using the recommended boot package. First, we discuss a naive approach to calculating bootstrap calibration intervals.

A simple, but ultimately wrong approach to resampling in controlled calibration is demonstrated in the following example for the nasturtium data:

library(boot)

## Function to compute estimate of x0
x0.fun <- function(object, y) {
  theta <- unname(coef(object))
  exp((log(theta[1] / mean(y) - 1) - theta[2]) / theta[3]) 
}
  
## Bootstrap setup
y0 <- c(309, 296, 419)
res <- resid(mod) - mean(resid(mod)) # center the residuals
n <- length(res)
boot.data <- data.frame(nasturtium, res = res, fit = fitted(mod))
boot.fun <- function(data, i) {
  boot.mod <- nls(fit + res[i] ~ theta1 / (1 + exp(theta2 + theta3 * log(conc))),
                  start = list(theta1 = 1000, theta2 = -1, theta3 = 1), data = data)
                    
  ## Make sure the original estimate also gets returned
  if (all(i == 1:n)) x0.fun(mod, y0) else x0.fun(boot.mod, y0)
}
  
## Run bootstrap simulation (takes about 50s on a standard laptop)
set.seed(123) # for reproducibility
res <- boot(boot.data, boot.fun, R = 9999) # collect 9,999 bootstrap samples
boot.ci(res, type = "bca") # obtain BCa confidence interval for x0

This produces an approximate 95% \(BC_a\) calibration interval of (2.04, 2.52), quite shorter than the ones produced by the inversion and Wald methods in the previous section. Did the bootstrap really do that well? The answer here is no, but it is not the bootstrap’s fault. Recall that \(\bar{y}_0\) is an observed value of the random variable \(\overline{\mathcal{Y}}_0\) which has variance \(\sigma^2/m\). This source of variability is ignored in the bootstrap Monte Carlo simulation above, which treats \(\bar{y}_0\) as a fixed constant. Compare the interval just obtained with the output from

invest(mod, y0 = mean(c(309, 296, 419)), mean.response = TRUE)

We can get a hold of this extra source of variability by again sampling with replacement from the centered residuals. In particular, we resample the responses \(y_{0j}\) \((j = 1, 2, \dotsc, m)\), by randomly selecting \(m\) additional residuals \(e_j^\star\) from the centered residuals and calculating the bootstrap responses \[y_{0j}^\star = \bar{y}_0 + e_j^\star, \quad j = 1, 2, \dotsc, m.\] Let \(\bar{y}_0^\star = \sum_{j=1}^m y_{0j}^\star/m\). The correct bootstrap replicate of \(x_0\) is given by \(\widehat{x}_0^\star = f^{-1}\left(\bar{y}_0^\star; \widehat{\boldsymbol{\theta}}^\star\right)\). To implement this, all we need to do is make the following changes to boot.fun (the changes are colored magenta):

boot.fun <- function(data, i) {
  boot.mod <- nls(fit + res[i] ~ theta1 / (1 + exp(theta2 + theta3 * log(conc))),
                  start = list(theta1 = 1000, theta2 = -1, theta3 = 1), data = data)
                    
  ## Simulate the correct variance
  Y0 <- y0 + sample(data$res, size = length(y0), replace = TRUE)
    
  ## Make sure the original estimate also gets returned
  if (all(i == 1:n)) x0.fun(mod, y0) else x0.fun(boot.mod, Y0)
}

Rerunning the previous example with the modified boot.fun function produces a more realistic 95% confidence interval for \(x_0\) of (1.818, 2.950) and an estimated standard error for \(\widehat{x}_0\) of 0.2861 (similar to that obtained by the delta method earlier). Figure 5 shows a histogram and normal Q-Q plot of the 9,999 bootstrap replicates of \(\widehat{x}_0\). The bootstrap distribution of \(\widehat{x}_0\) is comparable to the approximate posterior density of \(x_0\) given in Racine-Poon (1988, fig. 9). The bootstrap distribution is skewed to the right and clearly not normal, suggesting that the Wald interval may not be realistic for this problem.

Figure 5: The left panel shows a histogram of the bootstrap replicates with a vertical line indicating the position of the original estimate \(\widehat{x}_0\). The right panel shows a normal Q-Q plot of the bootstrap replicates.

Summary

We introduced the investr package for computing point estimates along with inversion and Wald-based confidence intervals for linear and nonlinear calibration problems with constant variance. We also showed how the boot package can be used for constructing approximate \(100(1 - \alpha)\%\) calibration intervals, which for convenience, may be incorporated into a future release of investr. The authors are currently working on extending the package to handle the case of heteroscedastic errors, random coefficient models [e.g., objects of class "lme" from the recommended nlme package; Pinheiro et al. (2014)]), and even multivariate calibration problems (e.g., objects of class "mlm").

Acknowledgements

The authors would like to thank two anonymous reviewers and the Editor for their helpful comments and suggestions.

The views expressed in this paper are those of the authors, and do not reflect the official policy or position of the United States Air Force, Navy, Department of Defense, or the U.S. Government.↩︎

investr: An R Package for Inverse Estimation

Brandon M. Greenwell

Christine M. Schubert Kabban

2014-05-27