If you do not like marginaleffects, you may want to consider one of the alternatives described below:

emmeans

The emmeans package is developed by Russell V. Lenth and colleagues. It is an extremely powerful package which focuses on the computation of marginal means, but which can also compute slopes.

In my (Vincent’s) biased opinion, the main benefits of marginaleffects over emmeans are:

  • Support for more model types.
  • Simpler and more intuitive user interface.
  • Easier to compute average marginal effects and unit-level marginal effects for whole datasets.
  • Easier to compute marginal effects (slopes) for custom grids and continuous regressors.
  • Common plots are easy with the plot_cap() and plot_cme() functions.

Many of marginaleffects’ advantages come down to subjective preferences about user interface. Readers are thus encouraged to try both packages to see which interface they prefer. Please keep in mind that emmeans offers several features which are not yet available in marginaleffects, including:

  • Equivalence tests.
  • Multiplicity adjustments.
  • Hypothesis tests on linear combinations of contrasts.
  • Some plots.

The Marginal Means Vignette includes side-by-side comparisons of emmeans and marginaleffects to compute marginal means. The rest of this section compares the syntax of the two packages for contrasts and marginal effects (slopes).

Contrasts

AFAICT, emmeans does not provide an easy way to compute unit-level contrasts for every row of the dataset used to fit our model. Therefore, the side-by-side syntax shown below will always include newdata = "mean" or newdata = datagrid() to specify that we want to compute only one contrast: at the mean values of the regressors. In day-to-day practice with comparisons(), however, this extra argument would not be necessary.

Fit a model:

library(emmeans)
library(marginaleffects)

mod <- glm(vs ~ hp + factor(cyl), data = mtcars, family = binomial)

Response scale, reference groups:

emm <- emmeans(mod, specs = "cyl", regrid = "response")
contrast(emm, method = "trt.vs.ctrl1")
#>  contrast    estimate    SE  df z.ratio p.value
#>  cyl6 - cyl4   -0.222 0.394 Inf  -0.564  0.7842
#>  cyl8 - cyl4   -0.595 0.511 Inf  -1.163  0.4049
#> 
#> P value adjustment: dunnettx method for 2 tests

comparisons(mod, newdata = "mean")
#>   rowid     type term    contrast    comparison    std.error   statistic
#> 1     1 response   hp (x + 1) - x -1.557824e-10 6.803615e-07 -0.00022897
#> 2     1 response  cyl       6 - 4 -2.221851e-01 3.916435e-01 -0.56731461
#> 3     1 response  cyl       8 - 4 -5.946855e-01 5.097420e-01 -1.16664014
#>     p.value      conf.low    conf.high       hp cyl
#> 1 0.9998173 -1.333640e-06 1.333328e-06 146.6875   8
#> 2 0.5705005 -9.897921e-01 5.454220e-01 146.6875   8
#> 3 0.2433557 -1.593761e+00 4.043905e-01 146.6875   8

Link scale, pairwise contrasts:

emm <- emmeans(mod, specs = "cyl")
contrast(emm, method = "revpairwise")
#>  contrast    estimate      SE  df z.ratio p.value
#>  cyl6 - cyl4   -0.905    1.63 Inf  -0.555  0.8439
#>  cyl8 - cyl4  -19.542 4367.17 Inf  -0.004  1.0000
#>  cyl8 - cyl6  -18.637 4367.16 Inf  -0.004  1.0000
#> 
#> Results are given on the log odds ratio (not the response) scale. 
#> P value adjustment: tukey method for comparing a family of 3 estimates

comparisons(mod,
            type = "link",
            newdata = "mean",
            variables = list(cyl = "pairwise"))
#>   rowid type term contrast  comparison   std.error    statistic   p.value
#> 1     1 link  cyl    6 - 4  -0.9048741    1.630231 -0.555058785 0.5788545
#> 2     1 link  cyl    8 - 4 -19.5417566 4367.165924 -0.004474700 0.9964297
#> 3     1 link  cyl    8 - 6 -18.6368825 4367.165407 -0.004267501 0.9965950
#>       conf.low  conf.high       hp cyl
#> 1    -4.100068    2.29032 146.6875   8
#> 2 -8579.029682 8539.94617 146.6875   8
#> 3 -8578.123795 8540.85003 146.6875   8

Contrasts by group

Here is a slightly more complicated example with contrasts estimated by subgroup in a glmmTMB mixed effects model. First we estimate a model and compute pairwise contrasts by subgroup using emmeans:

library(dplyr)
library(glmmTMB)
library(emmeans)

dat <- read.csv("https://vincentarelbundock.github.io/Rdatasets/csv/lme4/VerbAgg.csv")
dat$woman <- as.numeric(dat$Gender == "F")

mod <- glmmTMB(
    woman ~ btype * resp + situ + (1 + Anger | item),
    family = binomial,
    data = dat)

emmeans(mod, specs = "btype", by = "resp") |>
    contrast(method = "revpairwise", adjust = "none")
#> resp = no:
#>  contrast      estimate     SE   df t.ratio p.value
#>  scold - curse  -0.0152 0.1097 7571  -0.139  0.8898
#>  shout - curse  -0.2533 0.1022 7571  -2.479  0.0132
#>  shout - scold  -0.2381 0.0886 7571  -2.686  0.0072
#> 
#> resp = perhaps:
#>  contrast      estimate     SE   df t.ratio p.value
#>  scold - curse  -0.2393 0.1178 7571  -2.031  0.0423
#>  shout - curse  -0.0834 0.1330 7571  -0.627  0.5309
#>  shout - scold   0.1559 0.1358 7571   1.148  0.2511
#> 
#> resp = yes:
#>  contrast      estimate     SE   df t.ratio p.value
#>  scold - curse   0.0391 0.1292 7571   0.302  0.7624
#>  shout - curse   0.5802 0.1784 7571   3.252  0.0012
#>  shout - scold   0.5411 0.1888 7571   2.866  0.0042
#> 
#> Results are averaged over the levels of: situ 
#> Results are given on the log odds ratio (not the response) scale.

What did emmeans do to obtain these results? Roughly speaking:

  1. Create a prediction grid with one cell for each combination of categorical predictors in the model, and all numeric variables held at their means.
  2. Make adjusted predictions in each cell of the prediction grid.
  3. Take the average of those predictions (marginal means) for each combination of btype (focal variable) and resp (group by variable).
  4. Compute pairwise differences (contrasts) in marginal means across different levels of the focal variable btype.
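
Steps 1 to 4 can be sketched in base R. This is an illustrative sketch with a simpler, hypothetical logit model on mtcars (two categorical predictors, cyl and am, plus numeric hp) rather than the glmmTMB model above, so that it runs standalone:

```r
# Hypothetical illustration model: cyl and am categorical, hp numeric
m <- glm(vs ~ hp + factor(cyl) + factor(am), data = mtcars, family = binomial)

# Step 1: one grid cell per combination of categorical predictors,
# with the numeric predictor held at its mean
g <- expand.grid(cyl = c(4, 6, 8), am = c(0, 1), hp = mean(mtcars$hp))

# Step 2: adjusted predictions in each cell (link scale)
g$eta <- predict(m, newdata = g, type = "link")

# Step 3: marginal means: average predictions over am within each cyl level
mm <- tapply(g$eta, g$cyl, mean)

# Step 4: pairwise contrast between marginal means, e.g. cyl 6 - cyl 4
mm[["6"]] - mm[["4"]]
```

On the link scale, this particular contrast equals the factor(cyl)6 coefficient, because the model is additive in cyl and am.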

In short, emmeans computes pairwise contrasts between marginal means, which are themselves averages of adjusted predictions. This is different from the default behavior of comparisons(), which reports contrasts between adjusted predictions without averaging across a pre-specified grid of predictors. What does comparisons() do instead?

Let newdata be a data frame supplied by the user (or the original data frame used to fit the model), then:

  1. Create a new data frame called newdata2, which is identical to newdata except that the focal variable is incremented by one unit (or shifted by one level, for categorical variables).
  2. Compute contrasts as the difference between adjusted predictions made on the two datasets:
    • predict(model, newdata = newdata2) - predict(model, newdata = newdata)
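
Written out by hand for the simple mtcars logit model from earlier (a sketch with hp as the focal variable, so the increment is hp + 1):

```r
mod <- glm(vs ~ hp + factor(cyl), data = mtcars, family = binomial)

# Step 1: newdata2 is identical to newdata, except hp is incremented by 1
newdata  <- mtcars
newdata2 <- transform(mtcars, hp = hp + 1)

# Step 2: unit-level contrasts, one per row of the original data
contr <- predict(mod, newdata = newdata2, type = "response") -
         predict(mod, newdata = newdata, type = "response")
length(contr)  # one contrast per observation
```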

Although it is not idiomatic, we can still use comparisons() to emulate the emmeans results. First, we create a prediction grid with one cell for each combination of categorical predictors in the model:

nd <- datagrid(
    model = mod,
    resp = dat$resp,
    situ = dat$situ,
    btype = dat$btype)
nrow(nd)
#> [1] 18

This grid has 18 rows, one for each combination of levels for the resp (3), situ (2), and btype (3) variables (3 * 2 * 3 = 18).

Then we compute pairwise contrasts over this grid:

cmp <- comparisons(mod,
    variables = list("btype" = "pairwise"),
    newdata = nd,
    type = "link")
nrow(cmp)
#> [1] 54

There are 3 pairwise contrasts, corresponding to the 3 pairwise comparisons possible between the 3 levels of the focal variable btype: scold-curse, shout-scold, shout-curse. The comparisons() function estimates those 3 contrasts for each row of newdata, so we get \(18 \times 3 = 54\) rows.

Finally, we wanted contrasts for each subgroup of the resp variable, so we average out the contrasts using summary() and the by argument:

cmp |> summary(by = "resp")
#> Average contrasts 
#>      resp         btype   Effect Std. Error z value Pr(>|z|)     2.5 %  97.5 %
#> 1      no scold - curse -0.07180    0.06882 -1.0434 0.296785 -0.206685 0.06308
#> 2 perhaps shout - curse  0.08117    0.08172  0.9933 0.320580 -0.079003 0.24135
#> 3     yes shout - scold  0.15298    0.08299  1.8432 0.065296 -0.009688 0.31564
#> 
#> Model type:  glmmTMB 
#> Prediction type:  link

These results are identical to those produced by emmeans (except for \(t\) vs. \(z\)).

Marginal Effects

AFAICT, emmeans::emtrends makes it easier to compute marginal effects for a few user-specified values than for large grids or for the full original dataset.

Response scale, user-specified values:

mod <- glm(vs ~ hp + factor(cyl), data = mtcars, family = binomial)

emtrends(mod, ~hp, "hp", regrid = "response", at = list(cyl = 4))
#>   hp hp.trend    SE  df asymp.LCL asymp.UCL
#>  147 -0.00786 0.011 Inf   -0.0294    0.0137
#> 
#> Confidence level used: 0.95

marginaleffects(mod, newdata = datagrid(cyl = 4))
#>   rowid     type term contrast         dydx  std.error  statistic   p.value
#> 1     1 response   hp    dY/dX -0.007852329 0.01110851 -0.7068751 0.4796441
#> 2     1 response  cyl    6 - 4 -0.222185055 0.39164346 -0.5673146 0.5705005
#> 3     1 response  cyl    8 - 4 -0.594685481 0.50974200 -1.1666401 0.2433557
#>      conf.low  conf.high       hp cyl
#> 1 -0.02962461 0.01391995 146.6875   4
#> 2 -0.98979213 0.54542202 146.6875   4
#> 3 -1.59376144 0.40439048 146.6875   4

Link scale, user-specified values:

emtrends(mod, ~hp, "hp", at = list(cyl = 4))
#>   hp hp.trend     SE  df asymp.LCL asymp.UCL
#>  147  -0.0326 0.0339 Inf    -0.099    0.0338
#> 
#> Confidence level used: 0.95

marginaleffects(mod, type = "link", newdata = datagrid(cyl = 4))
#>   rowid type term contrast         dydx    std.error  statistic   p.value
#> 1     1 link   hp    dY/dX  -0.03257475 3.388105e-02 -0.9614446 0.3363287
#> 2     1 link  cyl    6 - 4  -0.90487411 1.630231e+00 -0.5550588 0.5788545
#> 3     1 link  cyl    8 - 4 -19.54175662 4.367166e+03 -0.0044747 0.9964297
#>        conf.low    conf.high       hp cyl
#> 1 -9.898038e-02 3.383088e-02 146.6875   4
#> 2 -4.100068e+00 2.290320e+00 146.6875   4
#> 3 -8.579030e+03 8.539946e+03 146.6875   4

margins and prediction

The margins and prediction packages for R were designed by Thomas Leeper to emulate the behavior of the margins command from Stata. These packages are trailblazers and strongly influenced the development of marginaleffects. The main benefits of marginaleffects over these packages are:

  • Support more model types
  • Faster
  • Memory efficient
  • Plots using ggplot2 instead of Base R
  • More extensive test suite
  • Active development

The syntax of the two packages is very similar.

Average Marginal Effects

library(margins)
library(marginaleffects)

mod <- lm(mpg ~ cyl + hp + wt, data = mtcars)

mar <- margins(mod)
summary(mar)
#>  factor     AME     SE       z      p   lower   upper
#>     cyl -0.9416 0.5509 -1.7092 0.0874 -2.0214  0.1382
#>      hp -0.0180 0.0119 -1.5188 0.1288 -0.0413  0.0052
#>      wt -3.1670 0.7406 -4.2764 0.0000 -4.6185 -1.7155

mfx <- marginaleffects(mod)
summary(mfx)
#> Average marginal effects 
#>   Term   Effect Std. Error z value   Pr(>|z|)    2.5 %    97.5 %
#> 1  cyl -0.94162    0.55092  -1.709   0.087417 -2.02139  0.138159
#> 2   hp -0.01804    0.01188  -1.519   0.128803 -0.04132  0.005239
#> 3   wt -3.16697    0.74058  -4.276 1.8997e-05 -4.61848 -1.715471
#> 
#> Model type:  lm 
#> Prediction type:  response

Individual-Level Marginal Effects

Marginal effects for every row of the original dataset:

head(data.frame(mar))
#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb   fitted se.fitted
#> 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4 22.82043 0.6876212
#> 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4 22.01285 0.6056817
#> 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1 25.96040 0.7349593
#> 4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1 20.93608 0.5800910
#> 5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 17.16780 0.8322986
#> 6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1 20.25036 0.6638322
#>     dydx_cyl    dydx_hp   dydx_wt Var_dydx_cyl  Var_dydx_hp Var_dydx_wt
#> 1 -0.9416168 -0.0180381 -3.166973    0.3035104 0.0001410451   0.5484521
#> 2 -0.9416168 -0.0180381 -3.166973    0.3035104 0.0001410451   0.5484521
#> 3 -0.9416168 -0.0180381 -3.166973    0.3035104 0.0001410451   0.5484521
#> 4 -0.9416168 -0.0180381 -3.166973    0.3035104 0.0001410451   0.5484521
#> 5 -0.9416168 -0.0180381 -3.166973    0.3035104 0.0001410451   0.5484521
#> 6 -0.9416168 -0.0180381 -3.166973    0.3035104 0.0001410451   0.5484521
#>   X_weights X_at_number
#> 1        NA           1
#> 2        NA           1
#> 3        NA           1
#> 4        NA           1
#> 5        NA           1
#> 6        NA           1

head(mfx)
#>   rowid     type term       dydx std.error statistic    p.value  conf.low
#> 1     1 response  cyl -0.9416168 0.5509163 -1.709183 0.08741706 -2.021393
#> 2     2 response  cyl -0.9416168 0.5509163 -1.709183 0.08741706 -2.021393
#> 3     3 response  cyl -0.9416168 0.5509164 -1.709183 0.08741712 -2.021393
#> 4     4 response  cyl -0.9416168 0.5509163 -1.709183 0.08741706 -2.021393
#> 5     5 response  cyl -0.9416168 0.5509164 -1.709183 0.08741710 -2.021393
#> 6     6 response  cyl -0.9416168 0.5509163 -1.709183 0.08741706 -2.021393
#>   conf.high  mpg cyl  hp    wt
#> 1 0.1381594 21.0   6 110 2.620
#> 2 0.1381594 21.0   6 110 2.875
#> 3 0.1381595 22.8   4  93 2.320
#> 4 0.1381594 21.4   6 110 3.215
#> 5 0.1381595 18.7   8 175 3.440
#> 6 0.1381594 18.1   6 105 3.460

Marginal Effects at the Mean

mar <- margins(mod, data = data.frame(mean_or_mode(mtcars)), unit_ses = TRUE)
data.frame(mar)
#>        mpg    cyl     disp       hp     drat      wt     qsec     vs      am
#> 1 20.09062 6.1875 230.7219 146.6875 3.596563 3.21725 17.84875 0.4375 0.40625
#>     gear   carb   fitted se.fitted   dydx_cyl    dydx_hp   dydx_wt Var_dydx_cyl
#> 1 3.6875 2.8125 20.09062 0.4439832 -0.9416168 -0.0180381 -3.166973    0.3035013
#>    Var_dydx_hp Var_dydx_wt SE_dydx_cyl SE_dydx_hp SE_dydx_wt X_weights
#> 1 0.0001410453     0.54846   0.5509096 0.01187625  0.7405808        NA
#>   X_at_number
#> 1           1

marginaleffects(mod, newdata = "mean")
#>   rowid     type term       dydx  std.error statistic      p.value    conf.low
#> 1     1 response  cyl -0.9416168 0.55091633 -1.709183 8.741706e-02 -2.02139298
#> 2     1 response   hp -0.0180381 0.01187625 -1.518838 1.288032e-01 -0.04131512
#> 3     1 response   wt -3.1669731 0.74057579 -4.276366 1.899688e-05 -4.61847499
#>     conf.high    cyl       hp      wt
#> 1  0.13815935 6.1875 146.6875 3.21725
#> 2  0.00523892 6.1875 146.6875 3.21725
#> 3 -1.71547124 6.1875 146.6875 3.21725

Counterfactual Average Marginal Effects

The at argument of the margins package emulates Stata by fixing the values of some variables at user-specified values, and by replicating the full dataset once for each combination of the supplied values (see the Stata section below). For example, if the dataset includes 32 rows and the user calls at=list(cyl=c(4, 6)), margins will compute 2 × 32 = 64 unit-level marginal effects estimates:

mod <- lm(mpg ~ cyl * hp + wt, data = mtcars)

mar <- margins(mod, at = list(cyl = c(4, 6, 8)))
summary(mar)
#>  factor    cyl     AME     SE       z      p   lower   upper
#>     cyl 4.0000  0.0381 0.6000  0.0636 0.9493 -1.1378  1.2141
#>     cyl 6.0000  0.0381 0.5999  0.0636 0.9493 -1.1376  1.2139
#>     cyl 8.0000  0.0381 0.5999  0.0636 0.9493 -1.1376  1.2139
#>      hp 4.0000 -0.0878 0.0267 -3.2937 0.0010 -0.1400 -0.0355
#>      hp 6.0000 -0.0499 0.0154 -3.2397 0.0012 -0.0800 -0.0197
#>      hp 8.0000 -0.0120 0.0108 -1.1065 0.2685 -0.0332  0.0092
#>      wt 4.0000 -3.1198 0.6613 -4.7175 0.0000 -4.4160 -1.8236
#>      wt 6.0000 -3.1198 0.6613 -4.7175 0.0000 -4.4160 -1.8236
#>      wt 8.0000 -3.1198 0.6613 -4.7175 0.0000 -4.4160 -1.8236

mfx <- marginaleffects(mod, newdata = datagrid(cyl = c(4, 6, 8), grid_type = "counterfactual"))
summary(mfx, by = "cyl")
#> Average marginal effects 
#>   Term cyl   Effect Std. Error  z value   Pr(>|z|)    2.5 %    97.5 %
#> 1  cyl   4  0.03814    0.59989  0.06357 0.94930949 -1.13762  1.213899
#> 2  cyl   6  0.03814    0.59989  0.06357 0.94930949 -1.13762  1.213899
#> 3  cyl   8  0.03814    0.59989  0.06357 0.94930949 -1.13762  1.213899
#> 4   hp   4 -0.08778    0.02665 -3.29366 0.00098892 -0.14002 -0.035545
#> 5   hp   6 -0.04987    0.01539 -3.23974 0.00119639 -0.08004 -0.019701
#> 6   hp   8 -0.01197    0.01081 -1.10649 0.26851517 -0.03316  0.009229
#> 7   wt   4 -3.11981    0.66132 -4.71754 2.3871e-06 -4.41598 -1.823648
#> 8   wt   6 -3.11981    0.66132 -4.71754 2.3871e-06 -4.41598 -1.823648
#> 9   wt   8 -3.11981    0.66132 -4.71754 2.3871e-06 -4.41598 -1.823648
#> 
#> Model type:  lm 
#> Prediction type:  response

Adjusted Predictions

The syntax to compute adjusted predictions using the prediction package or marginaleffects is very similar:

prediction::prediction(mod) |> head()
#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb   fitted se.fitted
#> 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4 21.90488 0.6927034
#> 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4 21.10933 0.6266557
#> 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1 25.64753 0.6652076
#> 4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1 20.04859 0.6041400
#> 5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 17.25445 0.7436172
#> 6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1 19.53360 0.6436862

marginaleffects::predictions(mod) |> head()
#>   rowid     type predicted std.error statistic       p.value conf.low conf.high
#> 1     1 response  21.90488 0.6927034  31.62231 1.822647e-219 20.54721  23.26256
#> 2     2 response  21.10933 0.6266557  33.68569 9.366966e-249 19.88111  22.33755
#> 3     3 response  25.64753 0.6652076  38.55568  0.000000e+00 24.34375  26.95131
#> 4     4 response  20.04859 0.6041400  33.18534 1.751939e-241 18.86450  21.23268
#> 5     5 response  17.25445 0.7436172  23.20340 4.207206e-119 15.79698  18.71191
#> 6     6 response  19.53360 0.6436862  30.34646 2.797513e-202 18.27200  20.79520
#>    mpg cyl  hp    wt
#> 1 21.0   6 110 2.620
#> 2 21.0   6 110 2.875
#> 3 22.8   4  93 2.320
#> 4 21.4   6 110 3.215
#> 5 18.7   8 175 3.440
#> 6 18.1   6 105 3.460

Stata

Stata is a good but expensive software package for statistical analysis. It is published by StataCorp LLC. This section compares Stata’s margins command to marginaleffects.

The results produced by marginaleffects are extensively tested against Stata. See the test suite for a list of the dozens of models where we compared estimates and standard errors.

Average Marginal Effect (AMEs)

Marginal effects are unit-level quantities. To compute “average marginal effects”, we first calculate marginal effects for each observation in a dataset. Then, we take the mean of those unit-level marginal effects.
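
Both steps are easy to write out with base R. Here is a sketch using a finite-difference slope for wt in the linear model fit in the examples below:

```r
mod <- lm(mpg ~ cyl + hp + wt, data = mtcars)
h <- 1e-6  # small step for the numerical derivative

# Step 1: one marginal effect (slope) of wt per observation
d2 <- transform(mtcars, wt = wt + h)
unit_me <- (predict(mod, newdata = d2) - predict(mod, newdata = mtcars)) / h

# Step 2: average the unit-level slopes to get the AME
mean(unit_me)  # equals coef(mod)[["wt"]] here, since the model is linear in wt
```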

Stata

Both Stata’s margins command and the marginaleffects function can calculate average marginal effects (AMEs). Here is an example showing how to estimate AMEs in Stata:

quietly reg mpg cyl hp wt
margins, dydx(*)

Average marginal effects                        Number of obs     =         32
Model VCE    : OLS
 
Expression   : Linear prediction, predict()
dy/dx w.r.t. : cyl hp wt
 
------------------------------------------------------------------------------
    |            Delta-method
    |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------------------------------------------------------------
cyl |  -.9416168   .5509164    -1.71   0.098    -2.070118    .1868842
 hp |  -.0180381   .0118762    -1.52   0.140    -.0423655    .0062893
 wt |  -3.166973   .7405759    -4.28   0.000    -4.683974   -1.649972
------------------------------------------------------------------------------

marginaleffects

The same results can be obtained with marginaleffects() and summary() like this:

library("marginaleffects")
mod <- lm(mpg ~ cyl + hp + wt, data = mtcars)
mfx <- marginaleffects(mod)
summary(mfx)
#> Average marginal effects 
#>   Term   Effect Std. Error z value   Pr(>|z|)    2.5 %    97.5 %
#> 1  cyl -0.94162    0.55092  -1.709   0.087417 -2.02139  0.138159
#> 2   hp -0.01804    0.01188  -1.519   0.128803 -0.04132  0.005239
#> 3   wt -3.16697    0.74058  -4.276 1.8997e-05 -4.61848 -1.715471
#> 
#> Model type:  lm 
#> Prediction type:  response

Note that Stata reports t statistics, while marginaleffects reports z statistics. This produces slightly different p-values because the model has low degrees of freedom: mtcars only has 32 rows.

Counterfactual Marginal Effects

A “counterfactual marginal effect” is a special quantity obtained by replicating a dataset while fixing some regressor to user-defined values.

Concretely, Stata computes counterfactual marginal effects in 3 steps:

  1. Duplicate the whole dataset 3 times and set the values of cyl to the three specified values in each of those subsets.
  2. Calculate marginal effects for each observation in that large grid.
  3. Take the average of marginal effects for each value of the variable of interest.
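
These three steps can be sketched in base R, using a finite-difference slope in place of Stata's analytic derivative (the model matches the factor-interaction specification used in this section):

```r
mod <- lm(mpg ~ as.factor(cyl) * hp + wt, data = mtcars)
h <- 1e-6

# Step 1: replicate the full dataset once per counterfactual value of cyl
cf <- do.call(rbind, lapply(c(4, 6, 8),
                            function(v) transform(mtcars, cyl = v)))

# Step 2: a marginal effect of hp for every row of the replicated grid
cf2 <- transform(cf, hp = hp + h)
cf$dydx_hp <- (predict(mod, newdata = cf2) - predict(mod, newdata = cf)) / h

# Step 3: average within each counterfactual value of cyl
tapply(cf$dydx_hp, cf$cyl, mean)
```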

Stata

With the at argument, Stata’s margins command estimates average counterfactual marginal effects. Here is an example:

quietly reg mpg i.cyl##c.hp wt
margins, dydx(hp) at(cyl = (4 6 8))

Average marginal effects                        Number of obs     =         32
Model VCE    : OLS

Expression   : Linear prediction, predict()
dy/dx w.r.t. : hp

1._at        : cyl             =           4

2._at        : cyl             =           6

3._at        : cyl             =           8

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
hp           |
         _at |
          1  |   -.099466   .0348665    -2.85   0.009    -.1712749   -.0276571
          2  |  -.0213768    .038822    -0.55   0.587    -.1013323    .0585787
          3  |   -.013441   .0125138    -1.07   0.293    -.0392137    .0123317
------------------------------------------------------------------------------

marginaleffects

You can estimate average counterfactual marginal effects with marginaleffects() by, first, setting the grid_type argument of datagrid() to "counterfactual" and, second, telling the summary function that you want to average within groups:

mod <- lm(mpg ~ as.factor(cyl) * hp + wt, data = mtcars)

mfx <- marginaleffects(
    mod,
    variables = "hp",
    newdata = datagrid(cyl = c(4, 6, 8),
                       grid_type = "counterfactual"))

summary(mfx, by = "cyl")
#> Average marginal effects 
#>   Term cyl   Effect Std. Error z value Pr(>|z|)    2.5 %   97.5 %
#> 1   hp   4 -0.09947    0.03487 -2.8528 0.004334 -0.16780 -0.03113
#> 2   hp   6 -0.02138    0.03882 -0.5506 0.581884 -0.09747  0.05471
#> 3   hp   8 -0.01344    0.01251 -1.0741 0.282780 -0.03797  0.01109
#> 
#> Model type:  lm 
#> Prediction type:  response

This is equivalent to taking the group-wise mean of observation-level marginal effects:

aggregate(dydx ~ term + cyl, data = mfx, FUN = mean)
#>   term cyl        dydx
#> 1   hp   4 -0.09946598
#> 2   hp   6 -0.02137679
#> 3   hp   8 -0.01344103

Note that, following Stata, the standard errors for group-averaged marginal effects are computed by taking the “Jacobian at the mean”:

J <- attr(mfx, "jacobian")
J_mean <- aggregate(J, by = list(mfx$cyl), FUN = mean)
J_mean <- as.matrix(J_mean[, 2:ncol(J_mean)])
sqrt(diag(J_mean %*% vcov(mod) %*% t(J_mean)))
#> [1] 0.03486650 0.03882203 0.01251382

Average Counterfactual Adjusted Predictions

Stata

Just as Stata’s margins command computes average counterfactual marginal effects, it can also estimate average counterfactual adjusted predictions.

Here is an example:

quietly reg mpg i.cyl##c.hp wt
margins, at(cyl = (4 6 8))

Predictive margins                              Number of obs     =         32
Model VCE    : OLS

Expression   : Linear prediction, predict()

1._at        : cyl             =           4

2._at        : cyl             =           6

3._at        : cyl             =           8

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   17.44233   2.372914     7.35   0.000     12.55522    22.32944
          2  |    18.9149   1.291483    14.65   0.000     16.25505    21.57476
          3  |   18.33318   1.123874    16.31   0.000     16.01852    20.64785
------------------------------------------------------------------------------

Again, this is what Stata does in the background:

  1. It duplicates the whole dataset 3 times and sets the values of cyl to the three specified values in each of those subsets.
  2. It calculates predictions for that large grid.
  3. It takes the average prediction for each value of cyl.

In other words, average counterfactual adjusted predictions as implemented by Stata are a hybrid between predictions at the observed values (the default in marginaleffects::predictions) and predictions at representative values.

marginaleffects

You can estimate average counterfactual adjusted predictions with predictions() by, first, setting the grid_type argument of datagrid() to "counterfactual" and, second, averaging the predictions using the by argument of summary(), or a manual function like dplyr::summarise().

mod <- lm(mpg ~ as.factor(cyl) * hp + wt, data = mtcars)

pred <- predictions(
    mod,
    newdata = datagrid(cyl = c(4, 6, 8),
                       grid_type = "counterfactual"))
summary(pred, by = "cyl")
#> Average Adjusted Predictions 
#>   cyl Predicted Std. Error z value   Pr(>|z|) CI low CI high
#> 1   4     17.44      2.373   7.351 1.9733e-13  12.79   22.09
#> 2   6     18.91      1.291  14.646 < 2.22e-16  16.38   21.45
#> 3   8     18.33      1.124  16.312 < 2.22e-16  16.13   20.54
#> 
#> Model type:  lm 
#> Prediction type:  response

pred %>%
    group_by(cyl) %>%
    summarize(AAP = mean(predicted))
#> # A tibble: 3 × 2
#>   cyl     AAP
#>   <fct> <dbl>
#> 1 4      17.4
#> 2 6      18.9
#> 3 8      18.3

brmsmargins

The brmsmargins package is developed by Joshua Wiley:

This package has functions to calculate marginal effects from brms models (http://paul-buerkner.github.io/brms/). A central motivator is to calculate average marginal effects (AMEs) for continuous and discrete predictors in fixed effects only and mixed effects regression models including location scale models.

The two main advantages of brmsmargins over marginaleffects are:

  1. Ability to compute “Marginal Coefficients” following the method described in Hedeker et al (2012).
  2. Ability to “integrate out random effects.”

The main advantages of marginaleffects over brmsmargins are:

  1. Support for 60+ model types, rather than just the brms package.
  2. Simpler user interface (subjective).
  3. At the time of writing (2022-05-25) brmsmargins did not support certain brms models such as those with multivariate or multinomial outcomes. It also did not support custom outcome transformations.

The rest of this section presents side-by-side replications of some of the analyses from the brmsmargins vignettes in order to highlight parallels and differences in syntax.

Marginal Effects for Fixed Effects Models

AMEs for Logistic Regression

Estimate a logistic regression model with brms:

library(brms)
library(brmsmargins)
library(marginaleffects)
library(data.table)
library(withr)
h <- 1e-4

void <- capture.output(
    bayes.logistic <- brm(
      vs ~ am + mpg, data = mtcars,
      family = "bernoulli", seed = 1234,
      silent = 2, refresh = 0,
      chains = 4L, cores = 4L)
)

Compute AMEs manually:

d1 <- d2 <- mtcars
d2$mpg <- d2$mpg + h
p1 <- posterior_epred(bayes.logistic, newdata = d1)
p2 <- posterior_epred(bayes.logistic, newdata = d2)
m <- (p2 - p1) / h
quantile(rowMeans(m), c(.5, .025, .975))
#>        50%       2.5%      97.5% 
#> 0.07014014 0.05437810 0.09159280

Compute AMEs with brmsmargins:

bm <- brmsmargins(
  bayes.logistic,
  add = data.frame(mpg = c(0, 0 + h)),
  contrasts = cbind("AME MPG" = c(-1 / h, 1 / h)),
  CI = 0.95,
  CIType = "ETI")
data.frame(bm$ContrastSummary)
#>            M        Mdn        LL        UL PercentROPE PercentMID   CI CIType
#> 1 0.07118446 0.07014014 0.0543781 0.0915928          NA         NA 0.95    ETI
#>   ROPE  MID   Label
#> 1 <NA> <NA> AME MPG

Compute AMEs using marginaleffects:

mfx <- marginaleffects(bayes.logistic) 
summary(mfx)
#> Average marginal effects 
#>   Term   Effect    2.5 %   97.5 %
#> 1   am -0.30976 -0.52182 -0.07808
#> 2  mpg  0.07119  0.05439  0.09158
#> 
#> Model type:  brmsfit 
#> Prediction type:  response

The mpg element of the Effect column from marginaleffects matches the M column of the output from brmsmargins.

Marginal Effects for Mixed Effects Models

Estimate a mixed effects logistic regression model with brms:

d <- withr::with_seed(
  seed = 12345, code = {
    nGroups <- 100
    nObs <- 20
    theta.location <- matrix(rnorm(nGroups * 2), nrow = nGroups, ncol = 2)
    theta.location[, 1] <- theta.location[, 1] - mean(theta.location[, 1])
    theta.location[, 2] <- theta.location[, 2] - mean(theta.location[, 2])
    theta.location[, 1] <- theta.location[, 1] / sd(theta.location[, 1])
    theta.location[, 2] <- theta.location[, 2] / sd(theta.location[, 2])
    theta.location <- theta.location %*% chol(matrix(c(1.5, -.25, -.25, .5^2), 2))
    theta.location[, 1] <- theta.location[, 1] - 2.5
    theta.location[, 2] <- theta.location[, 2] + 1
    d <- data.table(
      x = rep(rep(0:1, each = nObs / 2), times = nGroups))
    d[, ID := rep(seq_len(nGroups), each = nObs)]

    for (i in seq_len(nGroups)) {
      d[ID == i, y := rbinom(
        n = nObs,
        size = 1,
        prob = plogis(theta.location[i, 1] + theta.location[i, 2] * x))
        ]
    }
    copy(d)
  })

void <- capture.output(
    mlogit <- brms::brm(
      y ~ 1 + x + (1 + x | ID), family = "bernoulli",
      data = d, seed = 1234,
      silent = 2, refresh = 0,
      chains = 4L, cores = 4L)
)
#> Warning: There were 61 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess

AME: Including Random Effects

bm <- brmsmargins(
  mlogit,
  add = data.frame(x = c(0, h)),
  contrasts = cbind("AME x" = c(-1 / h, 1 / h)),
  effects = "includeRE",
  CI = .95,
  CIType = "ETI")
data.frame(bm$ContrastSummary)
#>           M       Mdn         LL        UL PercentROPE PercentMID   CI CIType
#> 1 0.1113512 0.1114643 0.08037199 0.1419889          NA         NA 0.95    ETI
#>   ROPE  MID Label
#> 1 <NA> <NA> AME x

mfx <- marginaleffects(mlogit)
summary(mfx)
#> Average marginal effects 
#>   Term Effect   2.5 % 97.5 %
#> 1    x 0.1114 0.08037  0.142
#> 
#> Model type:  brmsfit 
#> Prediction type:  response

AME: Fixed Effects Only (Grand Mean)

bm <- brmsmargins(
  mlogit,
  add = data.frame(x = c(0, h)),
  contrasts = cbind("AME x" = c(-1 / h, 1 / h)),
  effects = "fixedonly",
  CI = .95,
  CIType = "ETI")
data.frame(bm$ContrastSummary)
#>           M       Mdn         LL        UL PercentROPE PercentMID   CI CIType
#> 1 0.1038071 0.1032705 0.06328158 0.1485747          NA         NA 0.95    ETI
#>   ROPE  MID Label
#> 1 <NA> <NA> AME x

mfx <- marginaleffects(mlogit, re_formula = NA)
summary(mfx)
#> Average marginal effects 
#>   Term Effect   2.5 % 97.5 %
#> 1    x 0.1038 0.06328 0.1486
#> 
#> Model type:  brmsfit 
#> Prediction type:  response

Marginal Effects for Location Scale Models

AMEs for Fixed Effects Location Scale Models

Estimate a fixed effects location scale model with brms:

d <- withr::with_seed(
  seed = 12345, code = {
    nObs <- 1000L
    d <- data.table(
      grp = rep(0:1, each = nObs / 2L),
      x = rnorm(nObs, mean = 0, sd = 0.25))
    d[, y := rnorm(nObs,
                   mean = x + grp,
                   sd = exp(1 + x + grp))]
    copy(d)
  })

void <- capture.output(
    ls.fe <- brm(bf(
      y ~ 1 + x + grp,
      sigma ~ 1 + x + grp),
      family = "gaussian",
      data = d, seed = 1234,
      silent = 2, refresh = 0,
      chains = 4L, cores = 4L)
)

Fixed effects only

bm <- brmsmargins(
  ls.fe,
  add = data.frame(x = c(0, h)),
  contrasts = cbind("AME x" = c(-1 / h, 1 / h)),
  CI = 0.95, CIType = "ETI",
  effects = "fixedonly")
data.frame(bm$ContrastSummary)
#>          M     Mdn        LL       UL PercentROPE PercentMID   CI CIType ROPE
#> 1 1.625042 1.63264 0.7558805 2.497346          NA         NA 0.95    ETI <NA>
#>    MID Label
#> 1 <NA> AME x

mfx <- marginaleffects(ls.fe, re_formula = NA)
summary(mfx)
#> Average marginal effects 
#>   Term Effect  2.5 % 97.5 %
#> 1    x  1.625 0.7559  2.497
#> 2  grp  1.011 0.3482  1.678
#> 
#> Model type:  brmsfit 
#> Prediction type:  response

Discrete change and distributional parameter (dpar)

Compute the contrast between adjusted predictions on the sigma parameter, when grp=0 and grp=1:

bm <- brmsmargins(
  ls.fe,
  at = data.frame(grp = c(0, 1)),
  contrasts = cbind("AME grp" = c(-1, 1)),
  CI = 0.95, CIType = "ETI", dpar = "sigma",
  effects = "fixedonly")
data.frame(bm$ContrastSummary)
#>          M      Mdn       LL       UL PercentROPE PercentMID   CI CIType ROPE
#> 1 4.901384 4.895236 4.404996 5.417512          NA         NA 0.95    ETI <NA>
#>    MID   Label
#> 1 <NA> AME grp

In marginaleffects we use the comparisons() function and the variables argument:

cmp <- comparisons(
  ls.fe,
  variables = list(grp = 0:1),
  dpar = "sigma")
summary(cmp)
#> Average contrasts 
#>   Term Contrast Effect 2.5 % 97.5 %
#> 1  grp    1 - 0  4.901 4.405  5.418
#> 
#> Model type:  brmsfit 
#> Prediction type:  response

Marginal effect (continuous) on sigma

bm <- brmsmargins(
  ls.fe,
  add = data.frame(x = c(0, h)),
  contrasts = cbind("AME x" = c(-1 / h, 1 / h)),
  CI = 0.95, CIType = "ETI", dpar = "sigma",
  effects = "fixedonly")
data.frame(bm$ContrastSummary)
#>          M      Mdn       LL       UL PercentROPE PercentMID   CI CIType ROPE
#> 1 4.455297 4.443977 3.500513 5.449401          NA         NA 0.95    ETI <NA>
#>    MID Label
#> 1 <NA> AME x

mfx <- marginaleffects(ls.fe, dpar = "sigma", re_formula = NA)
summary(mfx)
#> Average marginal effects 
#>   Term Effect 2.5 % 97.5 %
#> 1    x  4.455 3.501  5.450
#> 2  grp  5.293 4.694  5.926
#> 
#> Model type:  brmsfit 
#> Prediction type:  response

effects

The effects package was created by John Fox and colleagues.

  • marginaleffects supports 30+ more model types than effects.
  • effects focuses on the computation of “adjusted predictions.” The plots it produces are roughly equivalent to the ones produced by the plot_cap and predictions functions in marginaleffects.
  • effects does not appear to support marginal effects (slopes), marginal means, or contrasts.
  • effects uses Base graphics, whereas marginaleffects uses ggplot2.
  • effects includes many powerful options to customize plots. In contrast, marginaleffects produces ggplot2 objects which can be customized by chaining ggplot2 functions. Users can also call plot_cap(model, draw=FALSE) to obtain the underlying prediction grid, and then work with the raw data directly to create the plot they need.
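For example, a custom plot can be built from the grid that plot_cap() returns with draw=FALSE. The plotting code is a sketch: the column names of the returned data frame may differ across marginaleffects versions, so inspect it first:

```r
library(ggplot2)
library(marginaleffects)

mod <- glm(vs ~ hp + factor(cyl), data = mtcars, family = binomial)

# draw = FALSE returns the underlying prediction grid instead of a plot
dat <- plot_cap(mod, condition = "hp", draw = FALSE)
head(dat)  # check the available column names before plotting

# sketch: map the (assumed) prediction and interval columns to ggplot2 layers
# ggplot(dat, aes(hp, predicted)) +
#   geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = 0.2) +
#   geom_line()
```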

effects offers several options which are not currently available in marginaleffects, including:

  • Partial residual plots.
  • Many ways to plot adjusted predictions (see the effects package vignette).

modelbased

The modelbased package is developed by the easystats team.

This section is incomplete; contributions are welcome.

  • Wrapper around emmeans to compute marginal means and marginal effects.
  • Powerful functions to create beautiful plots.
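As a brief illustration of the emmeans wrapper, here is a hedged sketch using the estimate_means() and estimate_slopes() functions; the `at` and `trend` arguments are as documented in recent modelbased releases and may differ in older versions:

```r
library(modelbased)

mod <- lm(mpg ~ hp + factor(cyl), data = mtcars)

estimate_means(mod, at = "cyl")      # marginal means (computed via emmeans)
estimate_slopes(mod, trend = "hp")   # marginal effect (slope) of hp
```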

ggeffects

The ggeffects package is developed by Daniel Lüdecke.

This section is incomplete; contributions are welcome.

  • Wrapper around emmeans to compute marginal means.
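A minimal sketch of the ggeffects workflow: ggemmeans() is the emmeans-backed function for marginal means, while ggpredict() computes adjusted predictions directly from predict():

```r
library(ggeffects)

mod <- glm(vs ~ hp + factor(cyl), data = mtcars, family = binomial)

ggemmeans(mod, terms = "cyl")   # marginal means via the emmeans backend
ggpredict(mod, terms = "hp")    # adjusted predictions over hp
```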