In a previous vignette, we introduced the “marginal effect” as a partial derivative. Since derivatives are only properly defined for continuous variables, we cannot use them to interpret the effects of changes in categorical variables. For this, we turn to contrasts between adjusted predictions. In the context of this package, a “contrast” is defined as:

The difference between two adjusted predictions, calculated for meaningfully different regressor values (e.g., College graduates vs. Others).

# Simple contrasts

Consider a simple model with a logical and a factor variable:

library(marginaleffects)

tmp <- mtcars
tmp$am <- as.logical(tmp$am)
mod <- lm(mpg ~ am + factor(cyl), tmp)

The marginaleffects function automatically computes contrasts for each level of the categorical variables, relative to the baseline category (FALSE for logicals, and the reference level for factors), while holding all other values at their mode or mean:

mfx <- marginaleffects(mod)
summary(mfx)
#> Average marginal effects
#>       type         Term  Effect Std. Error z value   Pr(>|z|)     2.5 % 97.5 %
#> 1 response       amTRUE   2.560      1.298   1.973    0.04851   0.01675  5.103
#> 2 response factor(cyl)6  -6.156      1.536  -4.009 6.1077e-05  -9.16608 -3.146
#> 3 response factor(cyl)8 -10.068      1.452  -6.933 4.1147e-12 -12.91359 -7.222
#>
#> Model type:  lm
#> Prediction type:  response

The summary printed above says that moving from the reference category 4 to the level 6 on the cyl factor variable is associated with a change of -6.156 in the adjusted prediction. Similarly, the contrast from FALSE to TRUE on the am variable is equal to 2.560.
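To make the definition of a contrast concrete, the two numbers discussed above can be recovered by hand as differences between pairs of adjusted predictions. The following is a minimal sketch using only base R's `predict()`. The object names (`nd_am`, `nd_cyl`, `contrast_am`, `contrast_cyl`) are introduced here for illustration, and holding the other regressor at `cyl = 4` or `am = FALSE` is an arbitrary choice: in this additive linear model the contrast does not depend on it.

```r
# Refit the model from this section (base R only; no extra packages assumed).
tmp <- mtcars
tmp$am <- as.logical(tmp$am)
mod <- lm(mpg ~ am + factor(cyl), tmp)

# Two hypothetical profiles that differ only in `am`.
# Holding cyl at 4 is an arbitrary choice made for this illustration.
nd_am <- data.frame(am = c(FALSE, TRUE), cyl = c(4, 4))
pred_am <- predict(mod, newdata = nd_am)
contrast_am <- unname(pred_am[2] - pred_am[1])

# Two profiles that differ only in `cyl` (reference level 4 vs. level 6).
nd_cyl <- data.frame(am = c(FALSE, FALSE), cyl = c(4, 6))
pred_cyl <- predict(mod, newdata = nd_cyl)
contrast_cyl <- unname(pred_cyl[2] - pred_cyl[1])

# Differences between adjusted predictions, rounded for display.
round(c(am = contrast_am, cyl6 = contrast_cyl), 3)
```

Because the model has no interactions, these differences coincide with the `amTRUE` and `factor(cyl)6` coefficients of `mod`, which is why they should agree with the corresponding rows of the summary.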

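We can obtain the same estimates and standard errors using the emmeans package. The p-values are not identical because emmeans reports t tests (and applies a Tukey adjustment to the cyl contrasts), whereas the summary above reports z tests: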

library(emmeans)
emm <- emmeans(mod, specs = "cyl")
contrast(emm, method = "revpairwise")
#>  contrast estimate   SE df t.ratio p.value
#>  6 - 4       -6.16 1.54 28  -4.009  0.0012
#>  8 - 4      -10.07 1.45 28  -6.933  <.0001
#>  8 - 6       -3.91 1.47 28  -2.660  0.0331
#>
#> Results are averaged over the levels of: am
#> P value adjustment: tukey method for comparing a family of 3 estimates

emm <- emmeans(mod, specs = "am")
contrast(emm, method = "revpairwise")
#>  contrast     estimate  SE df t.ratio p.value
#>  TRUE - FALSE     2.56 1.3 28   1.973  0.0585
#>
#> Results are averaged over the levels of: cyl
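The p-value for the am contrast is 0.0485 in the marginaleffects summary and 0.0585 in the emmeans output, even though the estimate and standard error agree. A minimal base R sketch, assuming the only difference is the reference distribution (standard normal versus t with the model's residual degrees of freedom), computes both p-values from the same test statistic. The names `stat`, `p_z`, and `p_t` are introduced here for illustration.

```r
# Refit the model from this section (base R only).
tmp <- mtcars
tmp$am <- as.logical(tmp$am)
mod <- lm(mpg ~ am + factor(cyl), tmp)

# Estimate and standard error of the am contrast (equal to the amTRUE
# coefficient in this additive linear model).
coefs <- summary(mod)$coefficients
stat <- coefs["amTRUE", "Estimate"] / coefs["amTRUE", "Std. Error"]

# Two-sided p-values under the two reference distributions.
p_z <- 2 * pnorm(-abs(stat))                      # standard normal
p_t <- 2 * pt(-abs(stat), df = df.residual(mod))  # t with residual df

c(statistic = stat, p_normal = p_z, p_t = p_t)
```

The t-based p-value is the larger of the two because the t distribution has heavier tails than the normal. The additional gap in the cyl contrasts comes from the Tukey adjustment that emmeans reports in its output.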

# Contrasts with interactions

In models with multiplicative interactions, the contrasts of a categorical variable will depend on the values of the interacted variable:

mod_int <- lm(mpg ~ am * factor(cyl), tmp)

We can now use the newdata argument of the marginaleffects function to compute contrasts for different values of the other regressors. As in the marginal effects vignette, the typical function can be handy. Since we only care about the logical am contrast, we use the variables argument to indicate the subset of results to report:

marginaleffects(mod_int, newdata = typical(cyl = tmp$cyl), variables = "am")
#>   rowid     type   term     dydx std.error    am cyl predicted
#> 1     1 response amTRUE 1.441667  2.315925 FALSE   6    19.125
#> 2     2 response amTRUE 5.175000  2.052848 FALSE   4    22.900
#> 3     3 response amTRUE 0.350000  2.315925 FALSE   8    15.050

Once again, we obtain the same results with emmeans:

emm <- emmeans(mod_int, specs = "am", by = "cyl")
contrast(emm, method = "revpairwise")
#> cyl = 4:
#>  contrast     estimate   SE df t.ratio p.value
#>  TRUE - FALSE     5.17 2.05 26   2.521  0.0182
#>
#> cyl = 6:
#>  contrast     estimate   SE df t.ratio p.value
#>  TRUE - FALSE     1.44 2.32 26   0.623  0.5390
#>
#> cyl = 8:
#>  contrast     estimate   SE df t.ratio p.value
#>  TRUE - FALSE     0.35 2.32 26   0.151  0.8810
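The contrasts in this section can also be reproduced by hand, which shows directly why they vary with cyl. The sketch below uses only base R. The grid `nd` and the name `contrast_by_cyl` are introduced here for illustration, and this is not necessarily how either package computes its results internally.

```r
# Refit the interaction model from this section (base R only).
tmp <- mtcars
tmp$am <- as.logical(tmp$am)
mod_int <- lm(mpg ~ am * factor(cyl), tmp)

# Grid of every am/cyl combination; expand.grid varies `am` fastest,
# so rows are ordered (FALSE, 4), (TRUE, 4), (FALSE, 6), (TRUE, 6), ...
nd <- expand.grid(am = c(FALSE, TRUE), cyl = c(4, 6, 8))
nd$pred <- predict(mod_int, newdata = nd)

# Contrast TRUE - FALSE within each cyl level.
contrast_by_cyl <- nd$pred[nd$am] - nd$pred[!nd$am]
names(contrast_by_cyl) <- paste0("cyl", nd$cyl[nd$am])
contrast_by_cyl
```

Since `am * factor(cyl)` fits a separate mean for each of the six cells, each contrast here is simply the difference between two observed cell means of mpg.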

# Complex queries

As the examples above show, the marginaleffects package includes limited support for computing contrasts. Users who require more powerful features are encouraged to consider alternative packages such as emmeans, modelbased, or ggeffects. These packages offer useful features such as automatic back-transformation, p-value correction for multiple comparisons, and more.