Main R function dr4pl
We start with Error Case 1 shown in Figures 2 and 6. As can be seen in those figures, this data set contains an extreme outlier at the dose level \(10^{-5}\). If the R package drc is used to fit the 4PL model to this data set, the following error message is returned.
> library(drc)
>
> drm(Response~Dose, data = drc_error_1, fct = LL.4())
Error in optim(startVec, opfct, hessian = TRUE, method = optMethod, control =
list(maxit = maxIt, : non-finite finite-difference value [4]
Error in drmOpt(opfct, opdfct1, startVecSc, optMethod, constrained, warnVal, :
Convergence failed
Note that this error message indicates convergence failure but no
explanation about its cause is given. On the contrary, fitting the 4PL
model with dr4pl returns an object of the class
"dr4pl" which contains useful information about the
convergence failure and a possible solution.
> library(dr4pl)
>
> dr4pl.error.1 <- dr4pl(Response~Dose, data = drc_error_1, method.init = "logistic",
+ use.Hessian = TRUE)
> class(dr4pl.error.1)
[1] "dr4pl"
> dr4pl.error.1$convergence # TRUE indicates convergence success,
+ # FALSE indicates convergence failure.
[1] FALSE
In this code, some input arguments such as method.init
and use.Hessian are specified as different values from the
default values of dr4pl. For example, the default value of
method.init is Mead but it is specified as
logistic in the code. This is to ensure that the function
dr4pl works with the same parameter setting as
drc. The R package drc uses the logistic method as its
default initialization method and the Hessian matrix in its confidence
interval estimation. Note that the function dr4pl does not
throw an error but returns the member
dr4pl.error.1$convergence which indicates that convergence
failure happens during the optimization process.
As a possible solution to the convergence failure, the object
dr4pl.error.1 provides a member object
dr4pl.robust that results from robust estimation and the
outlier detection algorithm explained in Subsection 4.1. We call the
object dr4pl.error.1$dr4pl.robust the robust child
object of its parent object
dr4pl.error.1.
> dr4pl.robust.1 <- dr4pl.error.1$dr4pl.robust
> class(dr4pl.error.1$dr4pl.robust)
[1] "dr4pl"
> dr4pl.robust.1$convergence
[1] TRUE
It can be seen from these code lines that the robust child object
does not report the convergence failure issue. To check results of
robust estimation and outlier detection, we check the plot of the object
dr4pl.robust.1.
> dr4pl.robust.1$idx.outlier
[1] 132 140 127 134 133 137 141 139 130 129 128 102
> plot(dr4pl.robust.1, indices.outlier = dr4pl.robust.1$idx.outlier)
The member idx.outlier of the object
dr4pl.robust.1 shows the indices of the outliers in the
input data frame drc_error_1 detected by our algorithm. It
can be seen that the last sentence in this code generates the same plot
as the upper left plot of Figure 6. As mentioned in Subsection
4.1, the
absolute loss is given as the default robust estimation method. This
implies that adopting the absolute loss successfully resolves the
convergence failure problem of Error Case 1.
Our R package dr4pl provides other real-world data sets than
Error Cases 1 - 4. We analyze the data set sample_data_5 to
compare the two different initialization methods, the logistic and Mead
methods, which were presented in Section 5.
> dr4pl.logistic.5 <- dr4pl(Response~Dose,
+ data = sample_data_5,
+ method.init = "logistic")
> ggplot.logistic <- plot(dr4pl.logistic.5, text.title = "Logistic method")
>
> dr4pl.Mead.5 <- dr4pl(Response~Dose,
+ data = sample_data_5)
> ggplot.Mead <- plot(dr4pl.Mead.5, text.title = "Mead's method")
>
> grid.arrange(ggplot.logistic, ggplot.Mead, nrow = 1, ncol = 2)
Running this code generates the two plots in Figure 7. Note that outliers are not denoted by red triangles in these plots since they are generated by the parent objects but not by their robust child objects. It can be seen from the figure that the data set has the support problem in the sense that it is hard to determine an IC50 parameter value visually and data points on the right side of possible IC50 parameter values are not sufficiently provided. Mead’s method shown in the right plot seems to yield a closer fit to the right-most data point. This coincides with our explanation in Subsection 5.2 that the method can result in better final parameter estimates particularly when data suffer from the support problem.