modelsummary includes a powerful set of utilities to customize the information displayed in your model summary tables. You can easily rename, reorder, subset or omit parameter estimates; choose the set of goodness-of-fit statistics to display; display various “robust” standard errors or confidence intervals; add titles, footnotes, or source notes; insert stars or custom characters to indicate levels of statistical significance; or add rows with supplemental information about your models.
library(modelsummary)
#> Loading required package: tables
#>
#> Attaching package: 'modelsummary'
#> The following object is masked from 'package:tables':
#>
#>     All
library(kableExtra)
library(gt)

url <- 'https://vincentarelbundock.github.io/Rdatasets/csv/HistData/Guerry.csv'
dat <- read.csv(url)

models <- list()
models[['OLS 1']] <- lm(Donations ~ Literacy + Clergy, data = dat)
models[['Poisson 1']] <- glm(Donations ~ Literacy + Commerce, family = poisson, data = dat)
models[['OLS 2']] <- lm(Crime_pers ~ Literacy + Clergy, data = dat)
models[['Poisson 2']] <- glm(Crime_pers ~ Literacy + Commerce, family = poisson, data = dat)
models[['OLS 3']] <- lm(Crime_prop ~ Literacy + Clergy, data = dat)
By default, modelsummary prints an uncertainty estimate in parentheses below the corresponding coefficient estimate. The value of this estimate is determined by the statistic argument.
statistic must be a string equal to conf.int or to the name of one of the columns produced by the broom::tidy function.
msummary(models, statistic = 'std.error')
msummary(models, statistic = 'p.value')
msummary(models, statistic = 'statistic')
You can display confidence intervals in brackets by setting statistic="conf.int":
msummary(models, statistic = 'conf.int', conf_level = .99)
| | OLS 1 | Poisson 1 | OLS 2 | Poisson 2 | OLS 3 |
|---|---|---|---|---|---|
| (Intercept) | 7948.667 | 8.241 | 16259.384 | 9.876 | 11243.544 |
| | [2469.565, 13427.769] | [8.226, 8.256] | [9375.457, 23143.311] | [9.867, 9.885] | [8577.542, 13909.546] |
| Clergy | 15.257 | | 77.148 | | -16.376 |
| | [-52.591, 83.105] | | [-8.096, 162.392] | | [-49.389, 16.637] |
| Literacy | -39.121 | 0.003 | 3.680 | -0.000 | -68.507 |
| | [-136.804, 58.562] | [0.003, 0.003] | [-119.048, 126.408] | [-0.000, -0.000] | [-116.037, -20.976] |
| Commerce | | 0.011 | | 0.001 | |
| | | [0.011, 0.011] | | [0.001, 0.001] | |
| Num.Obs. | 86 | 86 | 86 | 86 | 86 |
| R2 | 0.020 | | 0.065 | | 0.152 |
| R2 Adj. | -0.003 | | 0.043 | | 0.132 |
| AIC | 1740.8 | 274160.8 | 1780.0 | 257564.4 | 1616.9 |
| BIC | 1750.6 | 274168.2 | 1789.9 | 257571.7 | 1626.7 |
| Log.Lik. | -866.392 | -137077.401 | -886.021 | -128779.186 | -804.441 |
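To see where these quantities come from, the following base R sketch extracts the same kinds of statistics from a single model without broom. It uses the built-in mtcars data and a hypothetical model (`mod`) rather than the Guerry models above, so that it runs without a network connection. broom::tidy reports the first four quantities under the column names estimate, std.error, statistic, and p.value.

```r
# Hypothetical example model on a built-in dataset (not one of the models above)
mod <- lm(mpg ~ hp + wt, data = mtcars)

# Coefficient table: estimate, standard error, test statistic, p-value
est <- summary(mod)$coefficients
colnames(est)

# 99% confidence intervals, analogous to statistic = 'conf.int', conf_level = .99
ci99 <- confint(mod, level = 0.99)
ci99
```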
To display uncertainty estimates next to coefficients instead of below them:
msummary(models, statistic_vertical = FALSE)
You can override the uncertainty estimates in a number of ways. First, you can specify a function that produces variance-covariance matrices:
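As a minimal sketch of what such a function can look like, the code below computes White's heteroskedasticity-consistent (HC0) covariance matrix for a linear model in base R. The name `hc0` and the mtcars model are illustrative assumptions, not part of modelsummary. The formula is only valid for lm fits; for a list that mixes in glm models, as above, a tested estimator such as sandwich::vcovHC is the safer choice.

```r
# Hypothetical helper: HC0 "sandwich" covariance matrix for an lm object
hc0 <- function(model) {
  X <- model.matrix(model)          # design matrix
  e <- residuals(model)             # OLS residuals
  bread <- solve(crossprod(X))      # (X'X)^-1
  meat <- crossprod(X * e)          # X' diag(e^2) X
  bread %*% meat %*% bread
}

# Hypothetical model on a built-in dataset
mod <- lm(mpg ~ hp + wt, data = mtcars)
V <- hc0(mod)

# Robust standard errors are the square roots of the diagonal
sqrt(diag(V))
```

Any function of this shape (a fitted model in, a covariance matrix out) is the kind of object described above, and would be passed as `statistic_override = hc0`.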
You can also supply a list of functions of the same length as your model list, one function per model.
You can supply a list of named variance-covariance matrices:
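For instance, such a list can be built by applying a covariance function to each model. The sketch below uses stats::vcov and two hypothetical mtcars models so that it is self-contained. Giving the list the same names as the models is an assumption about how matrices are matched to models, not documented behavior.

```r
# Two hypothetical models on a built-in dataset
mods <- list(
  "OLS A" = lm(mpg ~ hp, data = mtcars),
  "OLS B" = lm(mpg ~ hp + wt, data = mtcars)
)

# One covariance matrix per model; lapply() keeps the list names,
# and each matrix carries the coefficient names as dimnames
vcov_matrices <- lapply(mods, vcov)

names(vcov_matrices)
rownames(vcov_matrices[["OLS B"]])
```

A list like this would then be passed as `statistic_override = vcov_matrices`.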
You can supply a list of named vectors:
custom_stats <- list(
  `OLS 1` = c(`(Intercept)` = 2, Literacy = 3, Clergy = 4),
  `Poisson 1` = c(`(Intercept)` = 3, Literacy = -5, Commerce = 3),
  `OLS 2` = c(`(Intercept)` = 7, Literacy = -6, Clergy = 9),
  `Poisson 2` = c(`(Intercept)` = 4, Literacy = -7, Commerce = -9),
  `OLS 3` = c(`(Intercept)` = 1, Literacy = -5, Clergy = -2))
msummary(models, statistic_override = custom_stats)
You can also display several different uncertainty estimates below the coefficient estimates.
You can add a title to your table as follows:
msummary(models, title = 'This is a title for my table.')
modelsummary offers a powerful and innovative mechanism to rename, reorder, and subset coefficients and goodness-of-fit statistics.
The coef_map argument is a named vector which allows users to rename, reorder, and subset coefficient estimates. Values of this vector correspond to the “clean” variable name. Names of this vector correspond to the “raw” variable name. The table will be sorted in the order in which terms are presented in coef_map. Coefficients which are not included in coef_map will be excluded from the table.
cm <- c('Literacy' = 'Literacy (%)',
        'Commerce' = 'Patents per capita',
        '(Intercept)' = 'Constant')
msummary(models, coef_map = cm)
| | OLS 1 | Poisson 1 | OLS 2 | Poisson 2 | OLS 3 |
|---|---|---|---|---|---|
| Literacy (%) | -39.121 | 0.003 | 3.680 | -0.000 | -68.507 |
| | (37.052) | (0.000) | (46.552) | (0.000) | (18.029) |
| Patents per capita | | 0.011 | | 0.001 | |
| | | (0.000) | | (0.000) | |
| Constant | 7948.667 | 8.241 | 16259.384 | 9.876 | 11243.544 |
| | (2078.276) | (0.006) | (2611.140) | (0.003) | (1011.240) |
| Num.Obs. | 86 | 86 | 86 | 86 | 86 |
| R2 | 0.020 | | 0.065 | | 0.152 |
| R2 Adj. | -0.003 | | 0.043 | | 0.132 |
| AIC | 1740.8 | 274160.8 | 1780.0 | 257564.4 | 1616.9 |
| BIC | 1750.6 | 274168.2 | 1789.9 | 257571.7 | 1626.7 |
| Log.Lik. | -866.392 | -137077.401 | -886.021 | -128779.186 | -804.441 |
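The renaming, reordering, and subsetting rules can be mimicked in a few lines of base R. This is only an illustration of the behavior described above, applied to a hypothetical vector of estimates (`est`); it is not how modelsummary is implemented.

```r
# Hypothetical estimates, in the order a model might report them
est <- c("(Intercept)" = 8.241, "Literacy" = 0.003,
         "Commerce" = 0.011, "Clergy" = 15.257)

# Names are the raw terms, values are the clean labels
cm <- c("Literacy" = "Literacy (%)",
        "Commerce" = "Patents per capita",
        "(Intercept)" = "Constant")

# Keep only mapped terms, in the order of cm, then apply the clean labels
keep <- intersect(names(cm), names(est))
out <- setNames(est[keep], cm[keep])
out
```

Clergy is absent from `cm`, so it is dropped, and the remaining terms follow the order of `cm` rather than the order of the model.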
An alternative mechanism to subset coefficients is to use the coef_omit argument. This string is a regular expression which will be fed to stringr::str_detect to detect the variable names which should be excluded from the table.
msummary(models, coef_omit = 'Intercept|Donation')
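To preview which terms a pattern would drop, you can test it against the term names directly. The sketch below uses base R's grepl as a stand-in for stringr::str_detect (for a simple pattern like this one the two should agree), with term names taken from the models above. The variable names are illustrative.

```r
# Term names that appear in the models above
term_names <- c("(Intercept)", "Literacy", "Clergy", "Commerce")

# TRUE marks the terms matched by the pattern, i.e. those that would be omitted
omit <- grepl("Intercept|Donation", term_names)

# Terms that would remain in the table
term_names[!omit]
```

Only the intercept is dropped here: the Donation alternative matches nothing, because Donations is an outcome variable in these models rather than a coefficient name.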
gof_omit is a regular expression which will be fed to stringr::str_detect to detect the names of the statistics which should be excluded from the table.
msummary(models, gof_omit = 'DF|Deviance|R2|AIC|BIC')
A more powerful mechanism is to supply a data.frame (or tibble) through the gof_map argument. This data.frame must include 4 columns:
- raw: a string with the name of a column produced by broom::glance(model).
- clean: a string with the “clean” name of the statistic you want to appear in your final table.
- fmt: a string which will be used to round/format the statistic in question (e.g., "%.3f"). This follows the same standards as the fmt argument in ?modelsummary.
- omit: TRUE if you want the statistic to be omitted from your final table.

You can see an example of a valid data frame by typing modelsummary::gof_map. This is the default data.frame that modelsummary uses to subset and reorder goodness-of-fit statistics. As you can see, omit == TRUE for quite a number of statistics. You can include those statistics by setting omit to FALSE for the corresponding rows.
The goodness-of-fit statistics will be printed in the table in the same order as in the gof_map data.frame.
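A hand-built data frame with the four required columns might look like the sketch below. The three rows are illustrative choices (r.squared, adj.r.squared, and AIC are columns that broom::glance returns for lm models), and the name `gm` is arbitrary.

```r
# Hypothetical gof_map: one row per statistic, printed in this row order
gm <- data.frame(
  raw   = c("r.squared", "adj.r.squared", "AIC"),
  clean = c("R2", "R2 Adj.", "AIC"),
  fmt   = c("%.3f", "%.3f", "%.1f"),
  omit  = c(FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Statistics that would be displayed, in order
gm$clean[!gm$omit]
```

It would then be supplied as `msummary(models, gof_map = gm)`.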
Notice the subtle difference between coef_map and gof_map. On the one hand, coef_map works as a “white list”: any coefficient not explicitly entered will be omitted from the table. On the other, gof_map works as a “black list”: statistics need to be explicitly marked for omission.
Some people like to add “stars” to their model summary tables to mark statistical significance. The stars argument can take three types of input:
- NULL omits any stars or special marks (default)
- TRUE uses these default values: `* p < 0.1, ** p < 0.05, *** p < 0.01`

Whenever stars != FALSE, modelsummary adds a note at the bottom of the table automatically. If you would like to omit this note, just use the stars_note argument:
msummary(models, stars = TRUE, stars_note = FALSE)
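The default thresholds listed above amount to the following mapping from p-values to marks. The helper name `p_to_stars` is hypothetical, and the function only illustrates the rule; it is not modelsummary's internal code.

```r
# Hypothetical helper implementing: * p < 0.1, ** p < 0.05, *** p < 0.01
p_to_stars <- function(p) {
  ifelse(p < 0.01, "***",
  ifelse(p < 0.05, "**",
  ifelse(p < 0.1,  "*", "")))
}

# One p-value per significance band
marks <- p_to_stars(c(0.2, 0.08, 0.03, 0.001))
marks
```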
If you want to create your own stars description, you can add custom notes with the notes argument.
The fmt argument defines how numeric values are rounded and presented in the table. This argument follows the sprintf C-library standard. For example,
- %.3f will keep 3 digits after the decimal point, including trailing zeros.
- %.5f will keep 5 digits after the decimal point, including trailing zeros.
- Replacing the f with an e (e.g., %.3e) will use the exponential decimal representation.

Most users will just modify the 3 in %.3f, but this is a very powerful system, and all users are encouraged to read the details: ?sprintf
msummary(models, fmt = '%.7f')
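Because fmt strings are ordinary sprintf formats, you can try them out in base R before passing them to msummary. The values below are arbitrary.

```r
x <- 12345.6789

sprintf("%.3f", x)   # three digits after the decimal point
sprintf("%.5f", 2)   # trailing zeros are kept
sprintf("%.3e", x)   # exponential notation

# A small negative value rounds to a signed zero
sprintf("%.3f", -0.0001)
```

The last call is the likely reason for entries such as -0.000 in the tables above: the estimate is negative but smaller in magnitude than the displayed precision.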
Use the add_rows argument to add rows manually to a table. For example, let’s say you estimate two models with a factor variable and you want to insert (a) an empty line to identify the reference category, and (b) customized information at the bottom of the table:
models <- list()
models[['OLS']] <- lm(mpg ~ factor(cyl), mtcars)
models[['Logit']] <- glm(am ~ factor(cyl), mtcars, family = binomial)
We create a data.frame with the following columns: “term”, “position”, “section”, and one column per model. “position” is an integer, and “section” is either “middle” or “bottom”. To build this data.frame, it is useful to call the tribble function (note the “r”) from the tibble package:
library(tibble)
rows <- tribble(~term,          ~OLS,  ~Logit, ~section, ~position,
                'factor(cyl)4', '-',   '-',    'middle', 3,
                'Info',         '???', 'XYZ',  'bottom', 4)
msummary(models, add_rows = rows)
| | OLS | Logit |
|---|---|---|
| (Intercept) | 26.664 | 0.981 |
| | (0.972) | (0.677) |
| factor(cyl)4 | - | - |
| factor(cyl)6 | -6.921 | -1.269 |
| | (1.558) | (1.021) |
| factor(cyl)8 | -11.564 | -2.773 |
| | (1.299) | (1.021) |
| Num.Obs. | 32 | 32 |
| R2 | 0.732 | |
| R2 Adj. | 0.714 | |
| Info | ??? | XYZ |
| AIC | 170.6 | 39.9 |
| BIC | 176.4 | 44.3 |
| Log.Lik. | -81.282 | -16.967 |
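If you prefer not to depend on tibble, the same rows can be written as a base R data frame. This is a sketch that assumes add_rows accepts a plain data.frame, as the description above suggests; the column names and values are copied from the example.

```r
# Same custom rows as the tribble() call, built with base R
rows <- data.frame(
  term     = c("factor(cyl)4", "Info"),
  OLS      = c("-", "???"),
  Logit    = c("-", "XYZ"),
  section  = c("middle", "bottom"),
  position = c(3, 4),
  stringsAsFactors = FALSE
)
rows
```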
Users can pass any additional argument they want to the tidy method which is used to extract estimates from a model. For example, in logistic or Cox proportional hazard models, many users want to exponentiate coefficients to facilitate interpretation. The tidy functions supplied by the broom package allow users to set exponentiate=TRUE to achieve this. In modelsummary, users can use the same argument:
mod_logit <- glm(am ~ mpg, data = mtcars, family = binomial)
msummary(mod_logit, exponentiate = TRUE)
Any argument supported by tidy is thus supported by modelsummary.
Warning: at the moment (2020-05-05), broom::tidy still reports std.error on the original scale. See this discussion on the broom GitHub page.
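To make the scale issue concrete, the base R sketch below exponentiates the coefficients of the same logit model by hand. The delta-method standard error and the Wald interval are standard approximations added here for illustration; they are not necessarily what broom or modelsummary report.

```r
mod_logit <- glm(am ~ mpg, data = mtcars, family = binomial)

b  <- coef(mod_logit)                # coefficients on the log-odds scale
se <- sqrt(diag(vcov(mod_logit)))    # standard errors on the log-odds scale

odds_ratio <- exp(b)                 # exponentiated coefficients
se_or <- odds_ratio * se             # delta-method SE on the odds-ratio scale

# Wald confidence interval, exponentiated endpoint by endpoint
ci_or <- exp(confint.default(mod_logit))

cbind(odds_ratio, se, se_or, ci_or)
```

Comparing the `se` and `se_or` columns shows why a standard error left on the original scale should not be read alongside an exponentiated estimate.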
Warning: When users supply a file name to the output argument, the table is written immediately to file. This means that users cannot post-process and customize the resulting table using functions from gt or kableExtra. To save a customized table, you should apply all the customization functions you need before saving it using gt::gtsave, kableExtra::save_kable, or another appropriate helper function.