## What to do when `marginaleffects`

is slow?

Some options:

- Compute marginal
effects and contrasts at the mean (or other representative value)
instead of all observed rows of the original dataset: Use the
`newdata`

argument and the`datagrid()`

function. - Compute marginal effects for a subset of variables, paying special
attention to exclude factor variables which can be particularly costly
to process: Use the
`variables`

argument. - Do not compute standard errors: Use the
`vcov = FALSE`

argument.

This simulation illustrates how computation time varies for a model with 25 regressors and 100,000 observations:

```
library(marginaleffects)
# simulate data and fit a large model
N <- 1e5
dat <- data.frame(matrix(rnorm(N * 26), ncol = 26))
mod <- lm(X1 ~ ., dat)
results <- bench::mark(
# marginal effects at the mean; no standard error
slopes(mod, vcov = FALSE, newdata = "mean"),
# marginal effects at the mean
slopes(mod, newdata = "mean"),
# 1 variable; no standard error
slopes(mod, vcov = FALSE, variables = "X3"),
# 1 variable
slopes(mod, variables = "X3"),
# 26 variables; no standard error
slopes(mod, vcov = FALSE),
# 26 variables
slopes(mod),
iterations = 1, check = FALSE)
results[, c(1, 3, 5)]
# <bch:expr> <bch:tm> <bch:byt>
# slopes(mod, vcov = FALSE, newdata = "mean") 230.09ms 1.24GB
# slopes(mod, newdata = "mean") 329.14ms 1.25GB
# slopes(mod, vcov = FALSE, variables = "X3") 198.7ms 496.24MB
# slopes(mod, variables = "X3") 1.27s 3.29GB
# slopes(mod, vcov = FALSE) 5.73s 11.05GB
# slopes(mod) 21.68s 78.02GB
```

The benchmarks above were conducted using the development version of
`marginaleffects`

on 2023-02-03.

## Speed comparison

The `slopes`

function is relatively fast. This simulation
was conducted using the development version of the package on
2022-04-04:

```
library(margins)
N <- 1e3
dat <- data.frame(
y = sample(0:1, N, replace = TRUE),
x1 = rnorm(N),
x2 = rnorm(N),
x3 = rnorm(N),
x4 = factor(sample(letters[1:5], N, replace = TRUE)))
mod <- glm(y ~ x1 + x2 + x3 + x4, data = dat, family = binomial)
```

`marginaleffects`

is about the same speed as
`margins`

when unit-level standard errors are *not*
computed:

```
results <- bench::mark(
slopes(mod, vcov = FALSE),
margins(mod, unit_ses = FALSE),
check = FALSE, relative = TRUE)
results[, c(1, 3, 5)]
# expression median mem_alloc
# <bch:expr> <dbl> <dbl>
# 1 slopes(mod, vcov = FALSE) 1 1
# 2 margins(mod, unit_ses = FALSE) 6.15 4.17
```

`marginaleffects`

can be 100x times faster than
`margins`

when unit-level standard errors are computed:

```
results <- bench::mark(
slopes(mod, vcov = TRUE),
margins(mod, unit_ses = TRUE),
check = FALSE, relative = TRUE, iterations = 1)
results[, c(1, 3, 5)]
# <bch:expr> <dbl> <dbl>
# slopes(mod, vcov = TRUE) 1 1
# margins(mod, unit_ses = TRUE) 128. 18.6
```

Models estimated on larger datasets (> 1000 observations) can be
difficult to process using the `margins`

package, because of
memory and time constraints. In contrast, `marginaleffects`

can work well on much larger datasets.

In some cases, `marginaleffects`

will be considerably
slower than packages like `emmeans`

or `modmarg`

.
This is because these packages make extensive use of hard-coded
analytical derivatives, or reimplement their own fast prediction
functions.