(Non-)Linear Tests for Null Hypotheses, Equivalence, Non-Superiority, and Non-Inferiority
Source: R/hypotheses.R, hypotheses.Rd
Uncertainty estimates are calculated as first-order approximate standard errors for linear or non-linear functions of a vector of random variables with known or estimated covariance matrix. In that sense, hypotheses() emulates the behavior of the excellent and well-established car::deltaMethod and car::linearHypothesis functions, but it supports more models; requires fewer dependencies; expands the range of tests to equivalence, non-superiority, and non-inferiority; and offers convenience features like robust standard errors.
To learn more, read the hypothesis tests vignette, visit the package website, or scroll down this page for a full list of vignettes.
Warning #1: Tests are conducted directly on the scale defined by the type argument. For some models, it can make sense to conduct hypothesis or equivalence tests on the "link" scale instead of the "response" scale, which is often the default.
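For instance, here is a minimal sketch (not taken from the package documentation) of conducting the test on the link scale of a logit model by setting type explicitly in the original marginaleffects call:

library(marginaleffects)
m <- glm(am ~ hp + mpg, data = mtcars, family = binomial)
# compute contrasts on the link (log-odds) scale instead of the default response scale
cmp_link <- comparisons(m, type = "link", newdata = "mean")
hypotheses(cmp_link, hypothesis = "b1 = b2")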
Warning #2: For hypothesis tests on objects produced by the marginaleffects package, it is safer to use the hypothesis argument of the original function. Using hypotheses() may not work in certain environments, in lists, or when working programmatically with *apply style functions.
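As a hedged illustration of this advice, the hypothesis argument can be passed directly to the original call rather than piping its output to hypotheses():

library(marginaleffects)
m <- glm(am ~ hp + mpg, data = mtcars, family = binomial)
# preferred: specify the test in the original call
avg_comparisons(m, hypothesis = "b1 = b2")
# also works interactively, but may fail in some programmatic contexts
hypotheses(avg_comparisons(m), hypothesis = "b1 = b2")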
Usage
hypotheses(
model,
hypothesis = NULL,
vcov = NULL,
conf_level = 0.95,
df = Inf,
equivalence = NULL,
FUN = NULL,
...
)
Arguments
- model
Model object or object generated by the comparisons(), slopes(), predictions(), or marginal_means() functions.
- hypothesis
Specify a hypothesis test or custom contrast using a numeric value, vector, or matrix, a string, or a string formula.
Numeric:
Single value: the null hypothesis used in the computation of Z and p (before applying transform).
Vector: Weights to compute a linear combination of (custom contrast between) estimates. Length equal to the number of rows generated by the same function call, but without the hypothesis argument.
Matrix: Each column is a vector of weights, as described above, used to compute a distinct linear combination of (contrast between) estimates. The column names of the matrix are used as labels in the output.
String formula to specify linear or non-linear hypothesis tests. If the term column uniquely identifies rows, terms can be used in the formula. Otherwise, use b1, b2, etc. to identify the position of each parameter. Examples:
hp = drat
hp + drat = 12
b1 + b2 + b3 = 0
String:
"pairwise": pairwise differences between estimates in each row.
"reference": differences between the estimates in each row and the estimate in the first row.
"sequential": difference between an estimate and the estimate in the next row.
"revpairwise", "revreference", "revsequential": inverse of the corresponding hypotheses, as described above.
See the Examples section below, the sketches after this Arguments list, and the vignette: https://vincentarelbundock.github.io/marginaleffects/articles/hypothesis.html
- vcov
Type of uncertainty estimates to report (e.g., for robust standard errors). Acceptable values:
FALSE: Do not compute standard errors. This can speed up computation considerably.
TRUE: Unit-level standard errors using the default vcov(model) variance-covariance matrix.
String which indicates the kind of uncertainty estimates to return.
Heteroskedasticity-consistent: "HC", "HC0", "HC1", "HC2", "HC3", "HC4", "HC4m", "HC5". See ?sandwich::vcovHC
Heteroskedasticity and autocorrelation consistent: "HAC"
Mixed-Models degrees of freedom: "satterthwaite", "kenward-roger"
Other: "NeweyWest", "KernHAC", "OPG". See the sandwich package documentation.
One-sided formula which indicates the name of cluster variables (e.g., ~unit_id). This formula is passed to the cluster argument of the sandwich::vcovCL function.
Square covariance matrix
Function which returns a covariance matrix (e.g., stats::vcov(model))
A brief illustration of these options appears in the sketches after this Arguments list.
- conf_level
Numeric value between 0 and 1. Confidence level used to build the confidence interval.
- df
Degrees of freedom used to compute p values and confidence intervals. A single numeric value between 1 and Inf. When df is Inf, the normal distribution is used. When df is finite, the t distribution is used. See insight::get_df for a convenient function to extract degrees of freedom (a sketch follows this Arguments list). Ex: slopes(model, df = insight::get_df(model))
- equivalence
Numeric vector of length 2: bounds used for the two-one-sided test (TOST) of equivalence, and for the non-inferiority and non-superiority tests. See Details section below.
- FUN
NULL or function.
NULL (default): hypothesis test on a model's coefficients, or on the quantities estimated by one of the marginaleffects package functions.
Function which accepts a model object and returns a numeric vector or a data.frame with two columns called term and estimate. This argument can be useful when users want to conduct a hypothesis test on an arbitrary function of quantities held in a model object. A brief sketch appears after this Arguments list.
- ...
Additional arguments are passed to the predict() method supplied by the modeling package. These arguments are particularly useful for mixed-effects or Bayesian models (see the online vignettes on the marginaleffects website). Available arguments can vary from model to model, depending on the range of arguments supported by each modeling package. See the "Model-Specific Arguments" section of the ?marginaleffects documentation for a non-exhaustive list of available arguments.
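The sketches below are illustrations added for clarity; they are not part of the package's reference output, and the weights and bounds are chosen arbitrarily. First, the vector and matrix forms of the hypothesis argument, where weights define custom linear combinations of the estimates:

library(marginaleffects)
m <- lm(mpg ~ hp + wt, data = mtcars)
# vector of weights: a single linear combination, here b2 - b3 (hp minus wt)
hypotheses(m, hypothesis = c(0, 1, -1))
# matrix of weights: one linear combination per column; column names label the output
w <- matrix(c(0, 1, -1,
              0, 1, 1),
            ncol = 2,
            dimnames = list(NULL, c("hp - wt", "hp + wt")))
hypotheses(m, hypothesis = w)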
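Next, a sketch of the string, formula, and matrix forms of the vcov argument (clustering on cyl is purely for demonstration):

library(marginaleffects)
m <- lm(mpg ~ hp + wt, data = mtcars)
hypotheses(m, hypothesis = "hp = wt", vcov = "HC3")     # heteroskedasticity-consistent
hypotheses(m, hypothesis = "hp = wt", vcov = ~cyl)      # clustered, via sandwich::vcovCL
hypotheses(m, hypothesis = "hp = wt", vcov = vcov(m))   # user-supplied covariance matrix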
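A sketch of the df argument, requesting t-based rather than normal-based inference via insight::get_df():

library(marginaleffects)
m <- lm(mpg ~ hp + wt, data = mtcars)
# residual degrees of freedom; p values and intervals then use the t distribution
hypotheses(m, hypothesis = "hp = wt", df = insight::get_df(m))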
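Finally, a sketch of a FUN that returns a two-column data.frame of term and estimate, here simply extracting two coefficients from the model object (the function name f2 is arbitrary):

library(marginaleffects)
m <- lm(mpg ~ hp + wt, data = mtcars)
f2 <- function(x) data.frame(
  term = c("hp", "wt"),
  estimate = coef(x)[c("hp", "wt")]
)
hypotheses(m, FUN = f2)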
Equivalence, Inferiority, Superiority
\(\theta\) is an estimate, \(\sigma_\theta\) its estimated standard error, and \([a, b]\) are the bounds of the interval supplied to the equivalence argument.
Non-inferiority:
\(H_0\): \(\theta \leq a\)
\(H_1\): \(\theta > a\)
\(t=(\theta - a)/\sigma_\theta\)
p: Upper-tail probability
Non-superiority:
\(H_0\): \(\theta \geq b\)
\(H_1\): \(\theta < b\)
\(t=(\theta - b)/\sigma_\theta\)
p: Lower-tail probability
Equivalence: Two One-Sided Tests (TOST)
p: Maximum of the non-inferiority and non-superiority p values.
Thanks to Russell V. Lenth for the excellent emmeans package and documentation, which inspired this feature.
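As an illustrative sketch (not from the original documentation), these tests can also be applied directly to a model coefficient, with equivalence bounds chosen arbitrarily:

library(marginaleffects)
m <- lm(mpg ~ hp + wt, data = mtcars)
# TOST: is the hp coefficient practically equivalent to zero, within [-0.1, 0.1]?
hypotheses(m, hypothesis = "hp = 0", equivalence = c(-0.1, 0.1))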
Examples
library(marginaleffects)
mod <- lm(mpg ~ hp + wt + factor(cyl), data = mtcars)
# When `FUN` and `hypothesis` are `NULL`, `hypotheses()` returns a data.frame of parameters
hypotheses(mod)
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> b1 35.8460 2.041 17.56 <0.001 31.8457 39.846319
#> b2 -0.0231 0.012 -1.93 0.0531 -0.0465 0.000306
#> b3 -3.1814 0.720 -4.42 <0.001 -4.5918 -1.771012
#> b4 -3.3590 1.402 -2.40 0.0166 -6.1062 -0.611803
#> b5 -3.1859 2.170 -1.47 0.1422 -7.4399 1.068169
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
#>
# Test of equality between coefficients
hypotheses(mod, hypothesis = "hp = wt")
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> hp = wt 3.16 0.72 4.39 <0.001 1.75 4.57
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
#>
# Non-linear function
hypotheses(mod, hypothesis = "exp(hp + wt) = 0.1")
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> exp(hp + wt) = 0.1 -0.0594 0.0292 -2.04 0.0418 -0.117 -0.0022
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
#>
# Robust standard errors
hypotheses(mod, hypothesis = "hp = wt", vcov = "HC3")
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> hp = wt 3.16 0.805 3.92 <0.001 1.58 4.74
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
#>
# b1, b2, ... shortcuts can be used to identify the position of the
# parameters of interest in the output of FUN
hypotheses(mod, hypothesis = "b2 = b3")
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> b2 = b3 3.16 0.72 4.39 <0.001 1.75 4.57
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
#>
# term names with special characters have to be enclosed in backticks
hypotheses(mod, hypothesis = "`factor(cyl)6` = `factor(cyl)8`")
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 %
#> `factor(cyl)6` = `factor(cyl)8` -0.173 1.65 -0.105 0.917 -3.41
#> 97.5 %
#> 3.07
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
#>
mod2 <- lm(mpg ~ hp * drat, data = mtcars)
hypotheses(mod2, hypothesis = "`hp:drat` = drat")
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> `hp:drat` = drat -6.08 2.89 -2.1 0.0357 -11.8 -0.405
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
#>
# predictions(), comparisons(), and slopes()
mod <- glm(am ~ hp + mpg, data = mtcars, family = binomial)
cmp <- comparisons(mod, newdata = "mean")
hypotheses(cmp, hypothesis = "b1 = b2")
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> b1=b2 -0.288 0.125 -2.31 0.0209 -0.532 -0.0435
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
#>
mfx <- slopes(mod, newdata = "mean")
hypotheses(cmp, hypothesis = "b2 = 0.2")
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> b2=0.2 0.101 0.131 0.774 0.439 -0.155 0.358
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
#>
pre <- predictions(mod, newdata = datagrid(hp = 110, mpg = c(30, 35)))
hypotheses(pre, hypothesis = "b1 = b2")
#>
#> Term Estimate Pr(>|z|) 2.5 % 97.5 %
#> b1=b2 0.00184 0.0264 7.07e-06 0.324
#>
#> Columns: term, estimate, p.value, conf.low, conf.high
#>
# The `FUN` argument can be used to compute standard errors for fitted values
mod <- glm(am ~ hp + mpg, data = mtcars, family = binomial)
f <- function(x) predict(x, type = "link", newdata = mtcars)
p <- hypotheses(mod, FUN = f)
head(p)
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> b1 -1.098 0.716 -1.534 0.125 -2.50 0.305
#> b2 -1.098 0.716 -1.534 0.125 -2.50 0.305
#> b3 0.233 0.781 0.299 0.765 -1.30 1.764
#> b4 -0.595 0.647 -0.919 0.358 -1.86 0.674
#> b5 -0.418 0.647 -0.645 0.519 -1.69 0.851
#> b6 -5.026 2.195 -2.290 0.022 -9.33 -0.725
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
#>
f <- function(x) predict(x, type = "response", newdata = mtcars)
p <- hypotheses(mod, FUN = f)
head(p)
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> b1 0.25005 0.1342 1.863 0.06243 -0.0130 0.5131
#> b2 0.25005 0.1342 1.863 0.06243 -0.0130 0.5131
#> b3 0.55803 0.1926 2.898 0.00376 0.1806 0.9355
#> b4 0.35560 0.1483 2.399 0.01646 0.0650 0.6462
#> b5 0.39710 0.1550 2.561 0.01043 0.0932 0.7010
#> b6 0.00652 0.0142 0.459 0.64635 -0.0213 0.0344
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
#>
# Equivalence, non-inferiority, and non-superiority tests
mod <- lm(mpg ~ hp + factor(gear), data = mtcars)
p <- predictions(mod, newdata = "median")
hypotheses(p, equivalence = c(17, 18))
#>
#> Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 % hp gear p (NonInf) p (NonSup)
#> 19.7 1 19.6 <0.001 17.7 21.6 123 3 0.00404 0.951
#> p (Equiv)
#> 0.951
#>
#> Columns: rowid, estimate, std.error, statistic, p.value, conf.low, conf.high, mpg, hp, gear, statistic.noninf, statistic.nonsup, p.value.noninf, p.value.nonsup, p.value.equiv
#>
mfx <- avg_slopes(mod, variables = "hp")
hypotheses(mfx, equivalence = c(-.1, .1))
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 % p (NonInf) p (NonSup)
#> hp -0.0669 0.011 -6.05 <0.001 -0.0885 -0.0452 0.00135 <0.001
#> p (Equiv)
#> 0.00135
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high, statistic.noninf, statistic.nonsup, p.value.noninf, p.value.nonsup, p.value.equiv
#>
cmp <- avg_comparisons(mod, variables = "gear", hypothesis = "pairwise")
hypotheses(cmp, equivalence = c(0, 10))
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 % p (NonInf)
#> (4 - 3) - (5 - 3) -3.94 2.05 -1.92 0.0543 -7.95 0.0727 0.973
#> p (NonSup) p (Equiv)
#> <0.001 0.973
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high, statistic.noninf, statistic.nonsup, p.value.noninf, p.value.nonsup, p.value.equiv
#>