modelsummary includes a powerful set of utilities to customize the information displayed in your model summary tables. You can easily rename, reorder, subset or omit parameter estimates; choose the set of goodness-of-fit statistics to display; display various “robust” standard errors or confidence intervals; add titles, footnotes, or source notes; insert stars or custom characters to indicate levels of statistical significance; or add rows with supplemental information about your models.

library(modelsummary)
#> Loading required package: tables
#> 
#> Attaching package: 'modelsummary'
#> The following object is masked from 'package:tables':
#> 
#>     All
library(kableExtra)
library(gt)

url <- 'https://vincentarelbundock.github.io/Rdatasets/csv/HistData/Guerry.csv'
dat <- read.csv(url)

models <- list()
models[['OLS 1']] <- lm(Donations ~ Literacy + Clergy, data = dat)
models[['Poisson 1']] <- glm(Donations ~ Literacy + Commerce, family = poisson, data = dat)
models[['OLS 2']] <- lm(Crime_pers ~ Literacy + Clergy, data = dat)
models[['Poisson 2']] <- glm(Crime_pers ~ Literacy + Commerce, family = poisson, data = dat)
models[['OLS 3']] <- lm(Crime_prop ~ Literacy + Clergy, data = dat)

Uncertainty estimates: SE, t, p, CI

By default, modelsummary prints an uncertainty estimate in parentheses below the corresponding coefficient estimate. The value of this estimate is determined by the statistic argument.

statistic must be a string equal to conf.int or to the name of one of the columns produced by the broom::tidy function.

msummary(models, statistic = 'std.error')
msummary(models, statistic = 'p.value')
msummary(models, statistic = 'statistic')
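To find out which strings are valid for a given model, it can help to look at the columns that broom::tidy returns. The following is a minimal sketch, assuming the broom package is installed; it uses the built-in mtcars data rather than the models above, purely for illustration.

```r
library(broom)

# Fit a small model on a built-in dataset (illustration only)
mod <- lm(mpg ~ wt, data = mtcars)

# Column names returned by tidy(); apart from 'term' and 'estimate',
# these are candidate values for the statistic argument
tidy_cols <- names(tidy(mod))
print(tidy_cols)
```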

You can display confidence intervals in brackets by setting statistic="conf.int":

msummary(models, statistic = 'conf.int', conf_level = .99)
             OLS 1                  Poisson 1       OLS 2                  Poisson 2         OLS 3
(Intercept)  7948.667               8.241           16259.384              9.876             11243.544
             [2469.565, 13427.769]  [8.226, 8.256]  [9375.457, 23143.311]  [9.867, 9.885]    [8577.542, 13909.546]
Clergy       15.257                                 77.148                                   -16.376
             [-52.591, 83.105]                      [-8.096, 162.392]                        [-49.389, 16.637]
Literacy     -39.121                0.003           3.680                  -0.000            -68.507
             [-136.804, 58.562]     [0.003, 0.003]  [-119.048, 126.408]    [-0.000, -0.000]  [-116.037, -20.976]
Commerce                            0.011                                  0.001
                                    [0.011, 0.011]                         [0.001, 0.001]
Num.Obs.     86                     86              86                     86                86
R2           0.020                                  0.065                                    0.152
R2 Adj.      -0.003                                 0.043                                    0.132
AIC          1740.8                 274160.8        1780.0                 257564.4          1616.9
BIC          1750.6                 274168.2        1789.9                 257571.7          1626.7
Log.Lik.     -866.392               -137077.401     -886.021               -128779.186       -804.441

To display uncertainty estimates next to coefficients instead of below them:

msummary(models, statistic_vertical = FALSE)

You can override the uncertainty estimates in a number of ways. First, you can specify a function that produces variance-covariance matrices:

library(sandwich)
msummary(models, statistic_override = vcovHC, statistic = 'p.value')
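The link between a variance-covariance function and the numbers shown in parentheses is that standard errors are the square roots of the diagonal of that matrix. The sketch below shows this in base R on the built-in mtcars data (an illustration, not modelsummary's internal code); the robust comparison runs only if the sandwich package is installed.

```r
mod <- lm(mpg ~ wt + hp, data = mtcars)

# Classical standard errors: square root of the diagonal of vcov()
se_classical <- sqrt(diag(vcov(mod)))

# These match the standard errors reported by summary()
se_summary <- coef(summary(mod))[, "Std. Error"]
print(cbind(se_classical, se_summary))

# Swapping in a robust estimator changes only the matrix
if (requireNamespace("sandwich", quietly = TRUE)) {
  se_robust <- sqrt(diag(sandwich::vcovHC(mod)))
  print(se_robust)
}
```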

You can supply a list of functions of the same length as your model list:

msummary(models,
   statistic_override = list(vcov, vcovHC, vcovHAC, vcovHC, vcov))

You can supply a list of named variance-covariance matrices:

vcov_matrices <- lapply(models, vcovHC)
msummary(models, statistic_override = vcov_matrices)

You can supply a list of named vectors:

custom_stats <- list(`OLS 1` = c(`(Intercept)` = 2, Literacy = 3, Clergy = 4),
                     `Poisson 1` = c(`(Intercept)` = 3, Literacy = -5, Commerce = 3),
                     `OLS 2` = c(`(Intercept)` = 7, Literacy = -6, Clergy = 9),
                     `Poisson 2` = c(`(Intercept)` = 4, Literacy = -7, Commerce = -9),
                     `OLS 3` = c(`(Intercept)` = 1, Literacy = -5, Clergy = -2))
msummary(models, statistic_override = custom_stats)
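Writing such a list by hand is error-prone, so it may be worth checking that the names line up before passing it to statistic_override. The sketch below rests on an assumption: that values are matched to models and coefficients by name, as the example above suggests. check_custom_stats is a hypothetical helper, not part of modelsummary, and the two models are fitted on mtcars only for illustration.

```r
# Hypothetical helper (not part of modelsummary): verify that a list of
# named vectors has one entry per model and one value per coefficient
check_custom_stats <- function(models, stats) {
  # The list must name the same models, in the same order
  if (!identical(names(models), names(stats))) {
    return(FALSE)
  }
  # Each vector must name exactly the coefficients of its model
  terms_match <- mapply(
    function(m, s) setequal(names(coef(m)), names(s)),
    models, stats
  )
  all(terms_match)
}

# Two small models on built-in data, for illustration only
mods <- list(
  "A" = lm(mpg ~ wt, data = mtcars),
  "B" = lm(mpg ~ hp, data = mtcars)
)
good <- list(
  "A" = c(`(Intercept)` = 2, wt = 3),
  "B" = c(`(Intercept)` = 1, hp = 4)
)
bad <- list(
  "A" = c(`(Intercept)` = 2, wt = 3),
  "B" = c(`(Intercept)` = 1, wt = 4)  # wrong term name for model B
)

check_custom_stats(mods, good)
check_custom_stats(mods, bad)
```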

You can also display several different uncertainty estimates below the coefficient estimates. For example, the following call prints the standard error, the p-value, and the confidence interval under each coefficient:

msummary(models, statistic = c('std.error', 'p.value', 'conf.int'))

Titles

You can add a title to your table as follows:

msummary(models, title = 'This is a title for my table.')

Notes

Add notes to the bottom of your table:

msummary(models,
   notes = list('Text of the first note.',
                'Text of the second note.'))

Rename, reorder, and subset

modelsummary offers a flexible mechanism to rename, reorder, and subset coefficients and goodness-of-fit statistics.

Coefficient estimates

The coef_map argument is a named vector which allows users to rename, reorder, and subset coefficient estimates. Values of this vector correspond to the “clean” variable name. Names of this vector correspond to the “raw” variable name. The table will be sorted in the order in which terms are presented in coef_map. Coefficients which are not included in coef_map will be excluded from the table.

cm <- c('Literacy' = 'Literacy (%)',
        'Commerce' = 'Patents per capita',
        '(Intercept)' = 'Constant')
msummary(models, coef_map = cm)
                    OLS 1       Poisson 1    OLS 2       Poisson 2    OLS 3
Literacy (%)        -39.121     0.003        3.680       -0.000       -68.507
                    (37.052)    (0.000)      (46.552)    (0.000)      (18.029)
Patents per capita              0.011                    0.001
                                (0.000)                  (0.000)
Constant            7948.667    8.241        16259.384   9.876        11243.544
                    (2078.276)  (0.006)      (2611.140)  (0.003)      (1011.240)
Num.Obs.            86          86           86          86           86
R2                  0.020                    0.065                    0.152
R2 Adj.             -0.003                   0.043                    0.132
AIC                 1740.8      274160.8     1780.0      257564.4     1616.9
BIC                 1750.6      274168.2     1789.9      257571.7     1626.7
Log.Lik.            -866.392    -137077.401  -886.021    -128779.186  -804.441
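
The rename, reorder, and subset behavior can be mimicked in a few lines of base R. This is only a sketch of the logic described above, not modelsummary's internal code; raw_terms is a made-up vector standing in for the terms of a single model.

```r
cm <- c('Literacy' = 'Literacy (%)',
        'Commerce' = 'Patents per capita',
        '(Intercept)' = 'Constant')

# Terms of a hypothetical model, in the order the model reports them
raw_terms <- c('(Intercept)', 'Literacy', 'Clergy')

# Keep only the entries of cm whose raw name appears in the model,
# preserving the order of cm; 'Clergy' is not in cm and is dropped
kept <- cm[names(cm) %in% raw_terms]
print(unname(kept))
```

Here the intercept moves below Literacy because coef_map lists it last, and Clergy disappears because it has no entry.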

An alternative mechanism to subset coefficients is to use the coef_omit argument. This string is a regular expression which will be fed to stringr::str_detect to detect the variable names which should be excluded from the table.

msummary(models, coef_omit = 'Intercept|Donation')
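
To preview which terms a pattern would drop, the regular expression can be tried on the term names directly. The sketch uses base R's grepl as a stand-in for stringr::str_detect; the two should agree for a simple alternation like this one. The vector term_names is written out by hand for illustration.

```r
term_names <- c('(Intercept)', 'Literacy', 'Clergy', 'Commerce')

# TRUE marks terms that match the pattern and would be omitted
omitted <- grepl('Intercept|Donation', term_names)

# Terms that would remain in the table
remaining <- term_names[!omitted]
print(remaining)
```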

Goodness-of-fit and other statistics

gof_omit is a regular expression which will be fed to stringr::str_detect to detect the names of the statistics which should be excluded from the table.

msummary(models, gof_omit = 'DF|Deviance|R2|AIC|BIC')

A more powerful mechanism is to supply a data.frame (or tibble) through the gof_map argument. This data.frame must include 4 columns:

  1. raw: a string with the name of a column produced by broom::glance(model).
  2. clean: a string with the “clean” name of the statistic you want to appear in your final table.
  3. fmt: a string which will be used to round/format the statistic in question (e.g., "%.3f"). This follows the same standards as the fmt argument in ?modelsummary.
  4. omit: TRUE if you want the statistic to be omitted from your final table.

You can see an example of a valid data frame by typing modelsummary::gof_map. This is the default data.frame that modelsummary uses to subset and reorder goodness-of-fit statistics. As you can see, omit == TRUE for quite a number of statistics. You can include all of them by setting omit to FALSE:

gm <- modelsummary::gof_map
gm$omit <- FALSE
msummary(models, gof_map = gm)

The goodness-of-fit statistics will be printed in the table in the same order as in the gof_map data.frame.
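
A table with the four required columns could also be built by hand. This is a sketch under assumptions: nobs and r.squared are columns that broom::glance returns for lm models, but whether a short hand-built table is accepted as-is may depend on your version of modelsummary. The name gm_custom and the clean labels are made up for illustration.

```r
# Hand-built goodness-of-fit map; row order sets display order
gm_custom <- data.frame(
  raw   = c('nobs', 'r.squared'),
  clean = c('N', 'R-squared'),
  fmt   = c('%.0f', '%.2f'),
  omit  = c(FALSE, FALSE),
  stringsAsFactors = FALSE
)
print(gm_custom)

# Preview how the fmt strings would render two example values
rendered <- sprintf(gm_custom$fmt, c(86, 0.152))
print(rendered)
```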

Notice the subtle difference between coef_map and gof_map. On the one hand, coef_map works as a “white list”: any coefficient not explicitly entered will be omitted from the table. On the other, gof_map works as a “black list”: statistics need to be explicitly marked for omission.

Stars: Statistical significance markers

Some people like to add “stars” to their model summary tables to mark statistical significance. The stars argument can take three types of input:

  1. NULL omits any stars or special marks (default)
  2. TRUE uses these default values: `* p < 0.1, ** p < 0.05, *** p < 0.01`
  3. Named numeric vector for custom stars.

msummary(models)
msummary(models, stars = TRUE)
msummary(models, stars = c('+' = .1, '&' = .01))

Whenever stars are displayed, modelsummary automatically adds a note at the bottom of the table. To omit this note, set stars_note = FALSE:

msummary(models, stars = TRUE, stars_note = FALSE)
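
How a named numeric vector could translate p-values into marks can be sketched in base R. The helper add_marks is hypothetical and illustrates one plausible rule (each p-value receives the mark of the strictest threshold it falls below); modelsummary's own implementation may differ.

```r
# Hypothetical helper: map p-values to the mark of the smallest
# threshold that each p-value falls below
add_marks <- function(p, stars = c('+' = .1, '&' = .01)) {
  stars <- sort(stars, decreasing = TRUE)  # loosest threshold first
  marks <- rep('', length(p))
  for (s in names(stars)) {
    # stricter thresholds overwrite looser ones
    marks[p < stars[[s]]] <- s
  }
  marks
}

marks <- add_marks(c(0.5, 0.05, 0.001))
print(marks)
```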

If you want to create your own stars description, you can add custom notes with the notes argument.

Digits, rounding, exponential notation

The fmt argument defines how numeric values are rounded and presented in the table. This argument follows the sprintf C-library standard. For example,

  • %.3f will keep 3 digits after the decimal point, including trailing zeros.
  • %.5f will keep 5 digits after the decimal point, including trailing zeros.
  • Replacing the f with an e switches to exponential (scientific) notation.

Most users will just modify the 3 in %.3f, but this is a very powerful system, and all users are encouraged to read the details: ?sprintf

msummary(models, fmt = '%.7f')
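
Since fmt strings are passed through sprintf-style formatting, their effect can be previewed in base R before building a table. The values below are arbitrary and chosen only to show rounding, trailing zeros, and exponential notation.

```r
# Fixed notation, 3 digits after the decimal point
fixed3 <- sprintf('%.3f', 3.14159)

# Trailing zeros are kept
fixed5 <- sprintf('%.5f', 2)

# Exponential notation, 3 digits after the decimal point
expo3 <- sprintf('%.3e', 12345.678)

print(c(fixed3, fixed5, expo3))
```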

Add rows manually

Use the add_rows argument to add rows manually to a table. For example, let’s say you estimate two models with a factor variable and you want to insert (a) an empty line to identify the reference category, and (b) customized information at the bottom of the table:

models <- list()
models[['OLS']] <- lm(mpg ~ factor(cyl), mtcars)
models[['Logit']] <- glm(am ~ factor(cyl), mtcars, family = binomial)

We create a data.frame with the following columns: “term”, “position”, “section”, and one column per model. “position” is an integer, and “section” is either “middle” or “bottom”. To build this data.frame, it is useful to call the tribble function (note the “r”) from the tibble package:

library(tibble)
rows <- tribble(~term,          ~OLS,  ~Logit, ~section, ~position,
                'factor(cyl)4', '-',   '-',    'middle', 3,
                'Info',         '???', 'XYZ',  'bottom', 4)

msummary(models, add_rows = rows)
              OLS      Logit
(Intercept)   26.664   0.981
              (0.972)  (0.677)
factor(cyl)4  -        -
factor(cyl)6  -6.921   -1.269
              (1.558)  (1.021)
factor(cyl)8  -11.564  -2.773
              (1.299)  (1.021)
Num.Obs.      32       32
R2            0.732
R2 Adj.       0.714
Info          ???      XYZ
AIC           170.6    39.9
BIC           176.4    44.3
Log.Lik.      -81.282  -16.967

Extra tidy arguments (e.g., exponentiated coefficients)

Users can pass any additional argument they want to the tidy method which is used to extract estimates from a model. For example, in logistic or Cox proportional hazards models, many users want to exponentiate coefficients to facilitate interpretation. The tidy functions supplied by the broom package allow users to set exponentiate=TRUE to achieve this. In modelsummary, users can use the same argument:

mod_logit <- glm(am ~ mpg, data = mtcars, family = binomial)
msummary(mod_logit, exponentiate = TRUE)

Any argument supported by tidy is thus supported by modelsummary.
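
What exponentiation does to the estimates can be checked by hand. In a logit model, exponentiating a coefficient converts a log-odds ratio into an odds ratio. The base R sketch below refits the same model and does not call modelsummary or broom.

```r
mod_logit <- glm(am ~ mpg, data = mtcars, family = binomial)

# Coefficients on the log-odds scale
log_odds <- coef(mod_logit)

# Exponentiated coefficients: odds ratios
odds_ratios <- exp(log_odds)
print(round(cbind(log_odds, odds_ratios), 3))
```

An odds ratio above 1 for mpg means that higher mpg is associated with higher odds of am being 1.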

Warning: at the moment (2020-05-05), broom::tidy still reports std.error on the original scale. See this discussion on the broom GitHub page.

Customizing by post-processing

Warning: When users supply a file name to the output argument, the table is written immediately to file. This means that users cannot post-process and customize the resulting table using functions from gt or kableExtra. To save a customized table, you should apply all the customization functions you need before saving it using gt::gtsave, kableExtra::save_kable, or another appropriate helper function.
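
The workflow described in this warning might look like the sketch below. It assumes that your version of modelsummary accepts output = 'gt' and returns a gt table object; tab_source_note and gtsave are functions from the gt package. The model, the note text, and the file path are made up for illustration.

```r
library(modelsummary)
library(gt)

mod <- lm(mpg ~ wt, data = mtcars)

# Build the table in memory instead of passing a file name to output
tab <- msummary(list('OLS' = mod), output = 'gt')

# Post-process with gt functions first...
tab <- tab_source_note(tab, source_note = 'Data: mtcars.')

# ...and only then write the customized table to a file
out_file <- file.path(tempdir(), 'table.html')
gtsave(tab, out_file)
```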