## What to do when `marginaleffects` is slow?

Some options:

- Compute marginal effects and contrasts at the mean (or other representative value) instead of all observed rows of the original dataset: Use the `newdata` argument and the `datagrid()` function.
- Compute marginal effects for a subset of variables, paying special attention to exclude factor variables, which can be particularly costly to process: Use the `variables` argument.
- Do not compute standard errors: Use the `vcov = FALSE` argument.
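As a small illustration, the three options could look like this on a toy model. This is a minimal sketch: the `mtcars` model, the value `hp = 100`, and the object names are assumptions chosen for the example, not part of the benchmarks in this section.

```r
library(marginaleffects)

# toy data and model, for illustration only
dat <- transform(mtcars, cyl = factor(cyl))
mod <- lm(mpg ~ hp + wt + cyl, data = dat)

# Option 1: evaluate slopes on a one-row grid instead of every observed row.
# Predictors not named in datagrid() are held at their mean or mode.
s_grid <- slopes(mod, newdata = datagrid(hp = 100))

# Option 2: restrict the computation to one numeric variable,
# which skips the contrasts for the factor `cyl`
s_var <- slopes(mod, variables = "hp")

# Option 3: skip standard errors entirely
s_nose <- slopes(mod, vcov = FALSE)
```

The options can be combined, for example `slopes(mod, variables = "hp", vcov = FALSE)`.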

This simulation illustrates how computation time varies for a model with 25 regressors and 100,000 observations:

```
library(marginaleffects)
# simulate data and fit a large model
N <- 1e5
dat <- data.frame(matrix(rnorm(N * 26), ncol = 26))
mod <- lm(X1 ~ ., dat)
results <- bench::mark(
    # marginal effects at the mean; no standard error
    slopes(mod, vcov = FALSE, newdata = "mean"),
    # marginal effects at the mean (datagrid() holds predictors at their mean or mode)
    slopes(mod, newdata = datagrid()),
    # 1 variable; no standard error
    slopes(mod, vcov = FALSE, variables = "X3"),
    # 1 variable
    slopes(mod, variables = "X3"),
    # all 25 regressors; no standard error
    slopes(mod, vcov = FALSE),
    # all 25 regressors
    slopes(mod),
    iterations = 1, check = FALSE)
results[, c(1, 3, 5)]
# <bch:expr> <bch:tm> <bch:byt>
# 1 slopes(mod, vcov = FALSE, newdata = "mean") 141.04ms 233.94MB
# 2 slopes(mod, newdata = "mean") 276.61ms 236.18MB
# 3 slopes(mod, vcov = FALSE, variables = "X3") 193.81ms 385.33MB
# 4 slopes(mod, variables = "X3") 2.85s 3.14GB
# 5 slopes(mod, vcov = FALSE) 4.32s 7.62GB
# 6 slopes(mod) 1.15m 76.55GB
```

The benchmarks above were conducted using the development version of `marginaleffects` on 2022-04-15.

## Speed comparison

The `slopes()` function is relatively fast. This simulation was conducted using the development version of the package on 2022-04-04:

```
library(marginaleffects)
library(margins)
# simulate a small dataset and fit a logit model
N <- 1e3
dat <- data.frame(
    y = sample(0:1, N, replace = TRUE),
    x1 = rnorm(N),
    x2 = rnorm(N),
    x3 = rnorm(N),
    x4 = factor(sample(letters[1:5], N, replace = TRUE)))
mod <- glm(y ~ x1 + x2 + x3 + x4, data = dat, family = binomial)
```

`marginaleffects` is about 6x faster than `margins` when unit-level standard errors are *not* computed:

```
results <- bench::mark(
    slopes(mod, vcov = FALSE),
    margins(mod, unit_ses = FALSE),
    check = FALSE, relative = TRUE)
results[, c(1, 3, 5)]
# expression median mem_alloc
# <bch:expr> <dbl> <dbl>
# 1 slopes(mod, vcov = FALSE) 1 1
# 2 margins(mod, unit_ses = FALSE) 6.15 4.17
```

`marginaleffects` can be nearly 600 times faster than `margins` when unit-level standard errors are computed:

```
results <- bench::mark(
    slopes(mod, vcov = TRUE),
    margins(mod, unit_ses = TRUE),
    check = FALSE, relative = TRUE, iterations = 1)
results[, c(1, 3, 5)]
# expression median mem_alloc
# 1 slopes(mod, vcov = TRUE) 1 1
# 2 margins(mod, unit_ses = TRUE) 581. 20.5
```

Models estimated on larger datasets (> 1000 observations) can be difficult to process using the `margins` package because of memory and time constraints. In contrast, `marginaleffects` can work well on much larger datasets.

In some cases, `marginaleffects` will be considerably slower than packages like `emmeans` or `modmarg`. This is because these packages make extensive use of hard-coded analytical derivatives, or reimplement their own fast prediction functions.
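To make the trade-off concrete, the base R sketch below compares a hard-coded analytical slope for a logit model with a generic finite-difference slope. It is an illustration under stated assumptions: the simulated data, the step size `h`, and the forward-difference scheme are chosen for the example and do not reproduce the internals of any of the packages named above.

```r
set.seed(1)
N <- 1e3
dat <- data.frame(x1 = rnorm(N), x2 = rnorm(N))
dat$y <- rbinom(N, 1, plogis(0.5 * dat$x1 - 0.25 * dat$x2))
mod <- glm(y ~ x1 + x2, data = dat, family = binomial)

# Analytical slope for a logit model: dP/dx1 = beta_x1 * p * (1 - p).
# Only one prediction pass is needed, but the formula is model-specific.
p <- predict(mod, type = "response")
slope_analytic <- coef(mod)["x1"] * p * (1 - p)

# Generic numerical slope: forward finite difference.
# Works for any model with a predict() method, at the cost of
# an extra prediction pass for each variable.
h <- 1e-6  # step size, an assumption for this sketch
dat_h <- transform(dat, x1 = x1 + h)
slope_numeric <- (predict(mod, newdata = dat_h, type = "response") - p) / h

# the two approaches agree up to numerical error
max(abs(slope_analytic - slope_numeric))
```

The analytical version is cheaper but has to be written separately for each model type, whereas the numerical version only relies on `predict()`.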