## Definition

Slopes are defined as:

Partial derivatives of the regression equation with respect to a regressor of interest; also known as marginal effects or trends.

This vignette follows the econometrics tradition by referring to “slopes” and “marginal effects” interchangeably. In this context, the word “marginal” refers to the idea of a “small change,” in the calculus sense.

A marginal effect measures the association between a change in a regressor \(x\) and a change in the response \(y\). Put differently, the marginal effect is the slope of the prediction function, measured at a specific value of the regressor \(x\).

Marginal effects are extremely useful, because they are intuitive and easy to interpret. They are often the main quantity of interest in an empirical analysis.

In scientific practice, the “Marginal Effect” falls in the same toolbox as the “Contrast.” Both try to answer a counterfactual question: What would happen to \(y\) if \(x\) were different? They allow us to model the “effect” of a change/difference in the regressor \(x\) on the response \(y\).^{1}

To illustrate the concept, consider this quadratic function:

\[y = -x^2\]

From the definition above, we know that the marginal effect is the partial derivative of \(y\) with respect to \(x\):

\[\frac{\partial y}{\partial x} = -2x\]
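Readers who want to double-check the calculus can ask base `R` to differentiate the expression symbolically:

```
# Symbolic derivative of -x^2 with respect to x
D(expression(-x^2), "x")
#> -(2 * x)
```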

To get intuition about how to interpret this quantity, consider the response of \(y\) to \(x\). It looks like this:

When \(x\) increases from a negative value, \(y\) increases until it peaks at \(x=0\). Then, as \(x\) increases further, \(y\) creeps back down into negative territory.

A marginal effect is the slope of this response function at a certain value of \(x\). The next plot adds three tangent lines, highlighting the slopes of the response function for three values of \(x\). The slopes of these tangents tell us three things:

- When \(x<0\), the slope is positive: an increase in \(x\) is associated with an increase in \(y\): The marginal effect is positive.
- When \(x=0\), the slope is null: a (small) change in \(x\) is associated with no change in \(y\). The marginal effect is null.
- When \(x>0\), the slope is negative: an increase in \(x\) is associated with a decrease in \(y\). The marginal effect is negative.
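The three signs can also be verified numerically with a small finite-difference sketch (the `slope()` helper below is illustrative, not part of any package):

```
# Central finite-difference approximation to the slope of y = -x^2
f <- function(x) -x^2
slope <- function(x, h = 1e-6) (f(x + h) - f(x - h)) / (2 * h)
round(sapply(c(-1, 0, 1), slope), 3)
#> [1]  2  0 -2
```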

Below, we show how to reach the same conclusions in an estimation context, with simulated data and the `slopes` function.

## The `slopes` function

The marginal effect is a *unit-level* measure of association
between changes in a regressor and changes in the response. Except in
the simplest linear models, the value of the marginal effect will be
different from individual to individual, because it will depend on the
values of the other covariates for each individual.

The `slopes` function thus produces distinct estimates of the marginal effect for each row of the data used to fit the model. The output of `slopes` is a simple `data.frame`, which can be inspected with all the usual `R` commands.

To show this, we load the library, download the Palmer Penguins data from the `Rdatasets` archive, and estimate a GLM model:

```
library(marginaleffects)
dat <- read.csv("https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv")
dat$large_penguin <- ifelse(dat$body_mass_g > median(dat$body_mass_g, na.rm = TRUE), 1, 0)
mod <- glm(large_penguin ~ bill_length_mm + flipper_length_mm + species,
           data = dat, family = binomial)
```

```
mfx <- slopes(mod)
head(mfx)
#>
#> Term Contrast Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> bill_length_mm dY/dX 0.0176 0.00830 2.12 0.03359 0.00137 0.0339
#> bill_length_mm dY/dX 0.0359 0.01229 2.92 0.00354 0.01176 0.0600
#> bill_length_mm dY/dX 0.0844 0.02108 4.01 < 0.001 0.04312 0.1258
#> bill_length_mm dY/dX 0.0347 0.00642 5.41 < 0.001 0.02214 0.0473
#> bill_length_mm dY/dX 0.0509 0.01350 3.77 < 0.001 0.02444 0.0773
#> bill_length_mm dY/dX 0.0165 0.00770 2.14 0.03202 0.00142 0.0316
#>
#> Columns: rowid, term, contrast, estimate, std.error, statistic, p.value, conf.low, conf.high, predicted, predicted_hi, predicted_lo, large_penguin, bill_length_mm, flipper_length_mm, species
```

## The Marginal Effects Zoo

A dataset with one marginal effect estimate per unit of observation is a bit unwieldy and difficult to interpret. There are ways to make this information easier to digest, by computing various quantities of interest. In a characteristically excellent blog post, Professor Andrew Heiss introduces many such quantities:

- Average Marginal Effects
- Group-Average Marginal Effects
- Marginal Effects at User-Specified Values (or Representative Values)
- Marginal Effects at the Mean
- Counterfactual Marginal Effects
- Conditional Marginal Effects

The rest of this vignette defines each of those quantities and explains how to use the `slopes()` and `plot_slopes()` functions to compute them. The main differences between these quantities pertain to (a) the regressor values at which we estimate marginal effects, and (b) the way in which unit-level marginal effects are aggregated.

Heiss drew this exceedingly helpful graph which summarizes the information in the rest of this vignette:

## Average Marginal Effect (AME)

A dataset with one marginal effect estimate per unit of observation
is a bit unwieldy and difficult to interpret. Many analysts like to
report the “Average Marginal Effect”, that is, the average of all the
observation-specific marginal effects. These are easy to compute based
on the full `data.frame` shown above, but the `avg_slopes()` function is convenient:

```
avg_slopes(mod)
#>
#> Term Contrast Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> bill_length_mm dY/dX 0.0276 0.00578 4.772 <0.001 0.01625 0.0389
#> flipper_length_mm dY/dX 0.0106 0.00235 4.509 <0.001 0.00598 0.0152
#> species Chinstrap - Adelie -0.4148 0.05659 -7.330 <0.001 -0.52570 -0.3039
#> species Gentoo - Adelie 0.0617 0.10688 0.577 0.564 -0.14778 0.2712
#>
#> Columns: term, contrast, estimate, std.error, statistic, p.value, conf.low, conf.high
```

Note that since marginal effects are derivatives, they are only properly defined for continuous numeric variables. When the model also includes categorical regressors, `avg_slopes()` will display relevant (regression-adjusted) contrasts between different categories, as shown above.

You can also extract average marginal effects using the `tidy` and `glance` methods, which conform to the `broom` package specification:

```
tidy(mfx)
#> # A tibble: 4 × 8
#> term contrast estimate std.error statistic p.value conf.low conf.high
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 bill_length_mm mean(dY/dX) 0.0276 0.00578 4.77 1.82e- 6 0.0162 0.0389
#> 2 flipper_length_mm mean(dY/dX) 0.0106 0.00235 4.51 6.52e- 6 0.00598 0.0152
#> 3 species mean(Chinstrap) - mean(Adelie) -0.415 0.0566 -7.33 2.31e-13 -0.526 -0.304
#> 4 species mean(Gentoo) - mean(Adelie) 0.0617 0.107 0.577 5.64e- 1 -0.148 0.271
glance(mfx)
#> # A tibble: 1 × 7
#> aic bic r2.tjur rmse nobs F logLik
#> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <logLik>
#> 1 180. 199. 0.695 0.276 342 15.7 -84.92257
```

## Group-Average Marginal Effect (G-AME)

We can also use the `by` argument to compute average marginal effects *within different subgroups* of the observed data, based on values of the regressors. For example, to compute the average marginal effects of Bill Length for each Species, we do:

```
avg_slopes(
  mod,
  by = "species",
  variables = "bill_length_mm")
#>
#> Term Contrast species Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> bill_length_mm mean(dY/dX) Adelie 0.04354 0.00879 4.95 <0.001 0.0263 0.06077
#> bill_length_mm mean(dY/dX) Chinstrap 0.03680 0.00976 3.77 <0.001 0.0177 0.05594
#> bill_length_mm mean(dY/dX) Gentoo 0.00287 0.00284 1.01 0.312 -0.0027 0.00844
#>
#> Columns: term, contrast, species, estimate, std.error, statistic, p.value, conf.low, conf.high, predicted, predicted_hi, predicted_lo
```

This is equivalent to manually taking the mean of the observation-level marginal effect for each species sub-group:

```
aggregate(
  mfx$estimate,
  by = list(mfx$species, mfx$term),
  FUN = mean)
#> Group.1 Group.2 x
#> 1 Adelie bill_length_mm 0.043539914
#> 2 Chinstrap bill_length_mm 0.036801185
#> 3 Gentoo bill_length_mm 0.002871562
#> 4 Adelie flipper_length_mm 0.016710631
#> 5 Chinstrap flipper_length_mm 0.014124217
#> 6 Gentoo flipper_length_mm 0.001102231
#> 7 Adelie species -0.054519623
#> 8 Chinstrap species -0.313337522
#> 9 Gentoo species -0.250726004
```

Note that `marginaleffects` follows `Stata` and the `margins` package in computing standard errors using the group-wise averaged Jacobian.

## Marginal Effect at User-Specified Values

Sometimes, we are not interested in *all* the unit-specific
marginal effects, but would rather look at the estimated marginal
effects for certain “typical” individuals, or for user-specified values
of the regressors. The `datagrid` function helps us build a data grid full of “typical” rows. For example, to generate artificial Adelies and Gentoos with 180mm flippers:

```
datagrid(
  flipper_length_mm = 180,
  species = c("Adelie", "Gentoo"),
  model = mod)
#> large_penguin bill_length_mm flipper_length_mm species
#> 1 0.4853801 43.92193 180 Adelie
#> 2 0.4853801 43.92193 180 Gentoo
```

The same command can be used (omitting the `model` argument) in the `newdata` argument of `slopes()` to compute marginal effects for those (fictional) individuals:

```
slopes(
  mod,
  newdata = datagrid(
    flipper_length_mm = 180,
    species = c("Adelie", "Gentoo")))
#>
#> Term Contrast Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 % bill_length_mm flipper_length_mm species
#> bill_length_mm dY/dX 0.0607 0.03323 1.827 0.0677 -0.00443 0.12581 43.9 180 Adelie
#> bill_length_mm dY/dX 0.0847 0.03939 2.150 0.0316 0.00747 0.16187 43.9 180 Gentoo
#> flipper_length_mm dY/dX 0.0233 0.00550 4.232 <0.001 0.01250 0.03408 43.9 180 Adelie
#> flipper_length_mm dY/dX 0.0325 0.00851 3.817 <0.001 0.01581 0.04918 43.9 180 Gentoo
#> species Chinstrap - Adelie -0.2111 0.10618 -1.988 0.0468 -0.41916 -0.00294 43.9 180 Adelie
#> species Chinstrap - Adelie -0.2111 0.10618 -1.988 0.0468 -0.41916 -0.00294 43.9 180 Gentoo
#> species Gentoo - Adelie 0.1591 0.30242 0.526 0.5988 -0.43361 0.75185 43.9 180 Adelie
#> species Gentoo - Adelie 0.1591 0.30242 0.526 0.5988 -0.43361 0.75185 43.9 180 Gentoo
#>
#> Columns: rowid, term, contrast, estimate, std.error, statistic, p.value, conf.low, conf.high, predicted, predicted_hi, predicted_lo, large_penguin, bill_length_mm, flipper_length_mm, species
```

When variables are omitted from the `datagrid` call, they will automatically be set at their mean or mode (depending on variable type).
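For instance, a call with no variable arguments at all returns a single “typical” row (a quick check, using the `mod` fitted earlier):

```
# Every regressor set to its mean (numeric) or mode (categorical)
datagrid(model = mod)
```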

## Marginal Effect at the Mean (MEM)

The “Marginal Effect at the Mean” is a marginal effect calculated for
a hypothetical observation where each regressor is set at its mean or
mode. By default, the `datagrid` function that we used in the previous section sets all regressors to their means or modes. To calculate the MEM, we can set the `newdata` argument, which determines the values of the predictors at which we want to compute marginal effects:

```
slopes(mod, newdata = "mean")
#>
#> Term Contrast Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> bill_length_mm dY/dX 0.0502 0.01238 4.059 <0.001 0.02598 0.0745
#> flipper_length_mm dY/dX 0.0193 0.00553 3.487 <0.001 0.00844 0.0301
#> species Chinstrap - Adelie -0.8070 0.07636 -10.569 <0.001 -0.95670 -0.6574
#> species Gentoo - Adelie 0.0829 0.11453 0.723 0.469 -0.14161 0.3073
#>
#> Columns: rowid, term, contrast, estimate, std.error, statistic, p.value, conf.low, conf.high, predicted, predicted_hi, predicted_lo, large_penguin, bill_length_mm, flipper_length_mm, species
```

## Counterfactual Marginal Effects

The `datagrid` function allowed us to look at completely fictional individuals. Setting the `grid_type` argument of this function to `"counterfactual"` lets us compute the marginal effects for the actual observations in our dataset, but with a few manipulated values. For example, this code will create a `data.frame` twice as long as the original `dat`, where each observation is repeated with different values of the `flipper_length_mm` variable:
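The grid itself can be built with a call like the following (a reconstruction: the two `flipper_length_mm` values, 160 and 180, are taken from the output shown below):

```
nd <- datagrid(
  flipper_length_mm = c(160, 180),
  grid_type = "counterfactual",
  model = mod)
```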

We see that rows 1, 2, and 3 of the original dataset have been replicated twice, with different values of the `flipper_length_mm` variable:

```
nd[nd$rowid %in% 1:3,]
#> rowidcf large_penguin bill_length_mm species flipper_length_mm
#> 1 1 0 39.1 Adelie 160
#> 2 2 0 39.5 Adelie 160
#> 3 3 0 40.3 Adelie 160
#> 343 1 0 39.1 Adelie 180
#> 344 2 0 39.5 Adelie 180
#> 345 3 0 40.3 Adelie 180
```

We can use the observation-level marginal effects to compute average (or median, or anything else) marginal effects over the counterfactual individuals:
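For example, base `R`'s `aggregate()` can average the counterfactual estimates, just as we did for the G-AME above (a sketch; `nd` is the counterfactual grid created earlier):

```
# Average marginal effect over the counterfactual observations
cmfx <- slopes(mod, newdata = nd)
aggregate(cmfx$estimate,
  by = list(term = cmfx$term, flipper_length_mm = cmfx$flipper_length_mm),
  FUN = mean)
```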

## Conditional Marginal Effects (Plot)

The `plot_slopes` function can be used to draw “Conditional Marginal Effects.” This is useful when a model includes interaction terms and we want to plot how the marginal effect of a variable changes as the value of a “condition” (or “moderator”) variable changes:

```
mod <- lm(mpg ~ hp * wt + drat, data = mtcars)
plot_slopes(mod, variables = "hp", condition = "wt")
```

The marginal effects in the plot above were computed with the values of all regressors – except the `variables` and the `condition` – held at their means or modes, depending on variable type.

Since `plot_slopes()` produces a `ggplot2` object, it is easy to customize. For example:

```
library(ggplot2)

plot_slopes(mod, variables = "hp", condition = "wt") +
  geom_rug(aes(x = wt), data = mtcars) +
  theme_classic()
```

## Example: Quadratic

In the “Definition” section of this vignette, we considered how marginal effects can be computed analytically in a simple quadratic equation context. We can now use the `slopes` function to replicate our analysis of the quadratic function in a regression application.

Say you estimate a linear regression model with a quadratic term:

\[Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon\]

and obtain estimates of \(\beta_1=1\) and \(\beta_2=2\). Taking the partial derivative with respect to \(X\) and plugging in our estimates gives us the marginal effect of \(X\) on \(Y\):

\[\frac{\partial Y}{\partial X} = \beta_1 + 2 \cdot \beta_2 X = 1 + 4X\]

This result suggests that the effect of a *change* in \(X\) on \(Y\) depends on the *level* of \(X\). When \(X\) is large and positive, an increase in \(X\) is associated with a large increase in \(Y\). When \(X\) is small and positive, an increase in \(X\) is associated with a small increase in \(Y\). When \(X\) is a large negative value, an increase in \(X\) is associated with a *decrease* in \(Y\).

`marginaleffects` arrives at the same conclusion in simulated data:

```
library(tidyverse)
N <- 1e5
quad <- data.frame(x = rnorm(N))
quad$y <- 1 + 1 * quad$x + 2 * quad$x^2 + rnorm(N)
mod <- lm(y ~ x + I(x^2), quad)
slopes(mod, newdata = datagrid(x = -2:2)) |>
  mutate(truth = 1 + 4 * x) |>
  select(estimate, truth)
#>
#>  Estimate truth
#>    -6.995    -7
#>    -2.998    -3
#>     0.998     1
#>     4.995     5
#>     8.991     9
#>
#> Columns: estimate, truth

We can plot conditional adjusted predictions with the `plot_predictions` function:

`plot_predictions(mod, condition = "x")`

We can plot conditional marginal effects with the `plot_slopes` function (see the “Conditional Marginal Effects (Plot)” section above):

`plot_slopes(mod, variables = "x", condition = "x")`

Again, the conclusion is the same. When \(x < -1/4\), an increase in \(x\) is associated with a decrease in \(y\). When \(x > -1/4\), the marginal effect is positive, which suggests that an increase in \(x\) is associated with an increase in \(y\).

## Slopes vs Predictions: A Visual Interpretation

Often, analysts will plot predicted values of the outcome with a best fit line:

```
library(ggplot2)
mod <- lm(mpg ~ hp * qsec, data = mtcars)
plot_predictions(mod, condition = "hp", vcov = TRUE) +
geom_point(data = mtcars, aes(hp, mpg))
```

The slope of this line is calculated using the same technique we all learned in grade school: dividing the rise by the run.

```
p <- plot_predictions(mod, condition = "hp", vcov = TRUE, draw = FALSE)

plot_predictions(mod, condition = "hp", vcov = TRUE) +
  geom_segment(aes(x = p$hp[10], xend = p$hp[10], y = p$estimate[10], yend = p$estimate[20])) +
  geom_segment(aes(x = p$hp[10], xend = p$hp[20], y = p$estimate[20], yend = p$estimate[20])) +
  annotate("text", label = "Rise", y = 10, x = 140) +
  annotate("text", label = "Run", y = 2, x = 200)
```
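Since the object `p` created above holds the plotted prediction grid, the rise-over-run arithmetic can also be carried out directly (a sketch; indices 10 and 20 are the same two points used to draw the segments):

```
# Slope of the fitted line: rise divided by run
(p$estimate[20] - p$estimate[10]) / (p$hp[20] - p$hp[10])
```

The result should be close to the `avg_slopes()` estimate reported below.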

Instead of computing this slope manually, we can just call:

```
avg_slopes(mod, variables = "hp")
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> hp -0.112 0.0126 -8.92 <0.001 -0.137 -0.0874
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
```

Now, consider the fact that our model includes an interaction between `hp` and `qsec`. This means that the slope will actually differ based on the value of the moderator variable `qsec`:

`plot_predictions(mod, condition = list("hp", "qsec" = "quartile"))`

We can estimate the slopes of these three fit lines easily:

```
slopes(
  mod,
  variables = "hp",
  newdata = datagrid(qsec = quantile(mtcars$qsec, probs = c(.25, .5, .75))))
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 % hp qsec
#> hp -0.0934 0.0111 -8.43 <0.001 -0.115 -0.0717 147 16.9
#> hp -0.1093 0.0123 -8.92 <0.001 -0.133 -0.0853 147 17.7
#> hp -0.1325 0.0154 -8.60 <0.001 -0.163 -0.1023 147 18.9
#>
#> Columns: rowid, term, estimate, std.error, statistic, p.value, conf.low, conf.high, predicted, predicted_hi, predicted_lo, mpg, hp, qsec
```

As we see in the graph, all three slopes are negative, but the Q3 slope is steepest.

We could then push this one step further, and measure the slope of `mpg` with respect to `hp`, *for all observed values* of `qsec`. This is achieved with the `plot_slopes()` function:

```
plot_slopes(mod, variables = "hp", condition = "qsec") +
  geom_hline(yintercept = 0, linetype = 3)
```

This plot shows that the marginal effect of `hp` on `mpg` is always negative (the slope is always below zero), and that this effect becomes even more negative as `qsec` increases.

## Prediction types

The `slopes` function takes the derivative of the fitted (or predicted) values of the model, as typically generated by the `predict(model)` function. By default, `predict` produces predictions on the `"response"` scale, so the marginal effects should be interpreted on that scale. However, users can pass a string or a vector of strings to the `type` argument, and `marginaleffects` will consider different outcomes.

Typical values include `"response"` and `"link"`, but users should refer to the documentation of the `predict` method of the package they used to fit the model to know what values are allowable.

```
mod <- glm(am ~ mpg, family = binomial, data = mtcars)
avg_slopes(mod, type = "response")
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> mpg 0.0465 0.00887 5.24 <0.001 0.0291 0.0639
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
avg_slopes(mod, type = "link")
#>
#> Term Estimate Std. Error z Pr(>|z|) 2.5 % 97.5 %
#> mpg 0.307 0.115 2.67 0.00751 0.0819 0.532
#>
#> Columns: term, estimate, std.error, statistic, p.value, conf.low, conf.high
```
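As a sanity check on the `"link"` result: the linear predictor of this simple logit model is \(\beta_0 + \beta_1 \cdot \text{mpg}\), so the link-scale slope is constant and equal to the `mpg` coefficient:

```
# The link-scale slope of a one-regressor GLM equals its coefficient
coef(mod)["mpg"]
```

This matches the `avg_slopes(mod, type = "link")` estimate of about 0.307.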