When analysts create tables to summarize statistical models, they often want to customize the information that is displayed in those tables (parameters, statistics, significant digits, etc.). The modelsummary function includes a powerful and intuitive set of arguments which allow users to change the content of their tables.

In addition, analysts often want to customize the appearance of their tables. To achieve this, modelsummary supports two table making packages: gt and kableExtra. These two packages open endless possibilities for customization. Each of them has different strengths and weaknesses. For instance, gt allows seamless integration with the RStudio IDE, but kableExtra’s LaTeX (and PDF) output is far more mature. The choice between gt and kableExtra should largely depend on the type of output format that users target:

Users are encouraged to read the documentation of both packages to see which syntax they prefer.

modelsummary can produce tables in a large array of formats. This table shows which package is used by default to create tables in each output format:

Output format Default package
html gt
latex kableExtra
markdown kableExtra
filename.rtf gt
filename.tex kableExtra
filename.md kableExtra
filename.txt kableExtra
filename.png kableExtra
filename.jpg kableExtra
Rmarkdown PDF kableExtra
Rmarkdown HTML gt

Both gt and kableExtra can produce LaTeX and HTML output. You can override the default settings by setting these global options:

options(modelsummary_latex = 'gt')

options(modelsummary_html = 'kableExtra')

Information: modelsummary

modelsummary includes a powerful set of utilities to customize the information displayed in your model summary tables. You can easily rename, reorder, subset or omit parameter estimates; choose the set of goodness-of-fit statistics to display; display various “robust” standard errors or confidence intervals; add titles, footnotes, or source notes; insert stars or custom characters to indicate levels of statistical significance; or add rows with supplemental information about your models.

library(modelsummary)
library(kableExtra)
library(gt)

url <- 'https://vincentarelbundock.github.io/Rdatasets/csv/HistData/Guerry.csv'
dat <- read.csv(url)

models <- list()
models[['OLS 1']] <- lm(Donations ~ Literacy + Clergy, data = dat)
models[['Poisson 1']] <- glm(Donations ~ Literacy + Commerce, family = poisson, data = dat)
models[['OLS 2']] <- lm(Crime_pers ~ Literacy + Clergy, data = dat)
models[['Poisson 2']] <- glm(Crime_pers ~ Literacy + Commerce, family = poisson, data = dat)
models[['OLS 3']] <- lm(Crime_prop ~ Literacy + Clergy, data = dat)

Uncertainty estimates: SE, t, p, CI

By default, modelsummary prints an uncertainty estimate in parentheses below the corresponding coefficient estimate. The value of this estimate is determined by the statistic argument.

statistic must be a string which equal to conf.int or to one of the columns produced by the broom::tidy function.

msummary(models, statistic = 'std.error')
msummary(models, statistic = 'p.value')
msummary(models, statistic = 'statistic')

You can display confidence intervals in brackets by setting statistic="conf.int":

msummary(models, statistic = 'conf.int', conf_level = .99)
OLS 1 Poisson 1 OLS 2 Poisson 2 OLS 3
(Intercept) 7948.667 8.241 16259.384 9.876 11243.544
[2469.565, 13427.769] [8.226, 8.256] [9375.457, 23143.311] [9.867, 9.885] [8577.542, 13909.546]
Clergy 15.257 77.148 -16.376
[-52.591, 83.105] [-8.096, 162.392] [-49.389, 16.637]
Literacy -39.121 0.003 3.680 -0.000 -68.507
[-136.804, 58.562] [0.003, 0.003] [-119.048, 126.408] [-0.000, -0.000] [-116.037, -20.976]
Commerce 0.011 0.001
[0.011, 0.011] [0.001, 0.001]
Num.Obs. 86 86 86 86 86
R2 0.020 0.065 0.152
Adj.R2 -0.003 0.043 0.132
AIC 1740.8 274160.8 1780.0 257564.4 1616.9
BIC 1750.6 274168.2 1789.9 257571.7 1626.7
Log.Lik. -866.392 -137077.401 -886.021 -128779.186 -804.441

To display uncertainty estimates next to coefficients instead of below them:

msummary(models, statistic_vertical = FALSE)

You can override the uncertainty estimates in a number of ways. First, you can specify a function that produces variance-covariance matrices:

library(sandwich)
msummary(models, statistic_override = vcovHC, statistic = 'p.value')

You can supply a list of functions of the same length as your model list:

msummary(models,
   statistic_override = list(vcov, vcovHC, vcovHAC, vcovHC, vcov))

You can supply a list of named variance-covariance matrices:

vcov_matrices <- lapply(models, vcovHC)
msummary(models, statistic_override = vcov_matrices)

You can supply a list of named vectors:

custom_stats <- list(`OLS 1` = c(`(Intercept)` = 2, Literacy = 3, Clergy = 4),
                     `Poisson 1` = c(`(Intercept)` = 3, Literacy = -5, Commerce = 3),
                     `OLS 2` = c(`(Intercept)` = 7, Literacy = -6, Clergy = 9),
                     `Poisson 2` = c(`(Intercept)` = 4, Literacy = -7, Commerce = -9),
                     `OLS 3` = c(`(Intercept)` = 1, Literacy = -5, Clergy = -2))
msummary(models, statistic_override = custom_stats)

You can also display several different uncertainty estimates below the coefficient estimates. For example,

msummary(models, statistic = c('std.error', 'p.value', 'conf.int'))

Will produce something like this:

Titles

You can add a title to your table as follows:

msummary(models, title = 'This is a title for my table.')

Notes

Add notes to the bottom of your table:

msummary(models,
   notes = list('Text of the first note.',
                'Text of the second note.'))

Rename, reorder, and subset

modelsummary offers a powerful and innovative mechanism to rename, reorder, and subset coefficients and goodness-of-fit statistics.

Coefficient estimates

The coef_map argument is a named vector which allows users to rename, reorder, and subset coefficient estimates. Values of this vector correspond to the “clean” variable name. Names of this vector correspond to the “raw” variable name. The table will be sorted in the order in which terms are presented in coef_map. Coefficients which are not included in coef_map will be excluded from the table.

cm <- c('Literacy' = 'Literacy (%)',
        'Commerce' = 'Patents per capita',
        '(Intercept)' = 'Constant')
msummary(models, coef_map = cm)
OLS 1 Poisson 1 OLS 2 Poisson 2 OLS 3
Literacy (%) -39.121 0.003 3.680 -0.000 -68.507
(37.052) (0.000) (46.552) (0.000) (18.029)
Patents per capita 0.011 0.001
(0.000) (0.000)
Constant 7948.667 8.241 16259.384 9.876 11243.544
(2078.276) (0.006) (2611.140) (0.003) (1011.240)
Num.Obs. 86 86 86 86 86
R2 0.020 0.065 0.152
Adj.R2 -0.003 0.043 0.132
AIC 1740.8 274160.8 1780.0 257564.4 1616.9
BIC 1750.6 274168.2 1789.9 257571.7 1626.7
Log.Lik. -866.392 -137077.401 -886.021 -128779.186 -804.441

An alternative mechanism to subset coefficients is to use the coef_omit argument. This string is a regular expression which will be fed to stringr::str_detect to detect the variable names which should be excluded from the table.

msummary(models, coef_omit = 'Intercept|Donation')

Goodness-of-fit and other statistics

gof_omit is a regular expression which will be fed to stringr::str_detect to detect the names of the statistics which should be excluded from the table.

msummary(models, gof_omit = 'DF|Deviance|R2|AIC|BIC')

A more powerful mechanism is to supply a data.frame (or tibble) through the gof_map argument. This data.frame must include 4 columns:

  1. raw: a string with the name of a column produced by broom::glance(model).
  2. clean: a string with the “clean” name of the statistic you want to appear in your final table.
  3. fmt: a string which will be used to round/format the string in question (e.g., "%.3f"). This follows the same standards as the fmt argument in ?modelsummary.
  4. omit: TRUE if you want the statistic to be omitted from your final table.

You can see an example of a valid data frame by typing modelsummary::gof_map. This is the default data.frame that modelsummary uses to subset and reorder goodness-of-fit statistics. As you can see, omit == TRUE for quite a number of statistics. You can include setting omit == FALSE:

gm <- modelsummary::gof_map
gm$omit <- FALSE
msummary(models, gof_map = gm)

The goodness-of-fit statistics will be printed in the table in the same order as in the gof_map data.frame.

Notice the subtle difference between coef_map and gof_map. On the one hand, coef_map works as a “white list”: any coefficient not explicitly entered will be omitted from the table. On the other, gof_map works as a “black list”: statistics need to be explicitly marked for omission.

Stars: Statistical significance markers

Some people like to add “stars” to their model summary tables to mark statistical significance. The stars argument can take three types of input:

  1. NULL omits any stars or special marks (default)
  2. TRUE uses these default values: `* p < 0.1, ** p < 0.05, *** p < 0.01`
  3. Named numeric vector for custom stars.
msummary(models)
msummary(models, stars = TRUE)
msummary(models, stars = c('+' = .1, '&' = .01))

Whenever stars != FALSE, modelsummary adds a note at the bottom of the table automatically. If you would like to omit this note, just use the stars_note argument:

msummary(models, stars = TRUE, stars_note = FALSE)

If you want to create your own stars description, you can add custom notes with the notes argument.

Digits, rounding, exponential notation

The fmt argument defines how numeric values are rounded and presented in the table. This argument follows the sprintf C-library standard. For example,

  • %.3f will keep 3 digits after the decimal point, including trailing zeros.
  • %.5f will keep 5 digits after the decimal point, including trailing zeros.
  • Changing the f for an e will use the exponential decimal representation.

Most users will just modify the 3 in %.3f, but this is a very powerful system, and all users are encouraged to read the details: ?sprintf

msummary(models, fmt = '%.7f')

Add rows manually

Use the add_rows argument to add rows manually to the bottom of the table.

row1 <- c('Custom row 1', 'a', 'b', 'c', 'd', 'e')
row2 <- c('Custom row 2', 5:1)
msummary(models, add_rows = list(row1, row2))

Use the add_rows argument to specify where the custom rows should be displayed in the bottom panel. For example, this prints custom rows after the coefficients, but at first position in the goodness of fit measures:

msummary(models, add_rows = list(row1, row2), add_rows_location = 0)

This prints custom rows after the 2nd GOF statistic:

msummary(models, add_rows = list(row1, row2), add_rows_location = 2)

Extra tidy arguments (e.g., exponentiated coefficients)

Users can pass any additional argument they want to the tidy method which is used to extract estimates from a model. For example, in logitistic or Cox proportional hazard models, many users want to exponentiate coefficients to faciliate interpretation. The tidy functions supplied by the broom package allow users to set exponentiate=TRUE to achieve this. In modelsummary, users can use the same argument:

mod_logit <- glm(am ~ mpg, data = mtcars, family = binomial)
msummary(mod_logit, exponentiate = TRUE)

Any argument supported by tidy is thus supported by modelsummary.

Warning: at the moment (2020-05-05), broom::tidy still reports std.error on the original scale. See this discussion on the broom GitHub page.

Customizing by post-processing

Warning: When users supply a file name to the output argument, the table is written immediately to file. This means that users cannot post-process and customize the resulting table using functions from gt or kableExtra. To save a customized table, you should apply all the customization functions you need before saving it using gt::gtsave, kableExtra::save_kable, or another appropriate helper function.

Appearance: gt

Fonts, colors, and styles

Thanks to gt, modelsummary accepts markdown indications for emphasis and more:

msummary(models,
         title = md('This is a **bolded series of words.**'),
         notes = list(md('And an *emphasized note*.')))

We can modify the size of the text with gt’s tab_style function:

msummary(models) %>%
    tab_style(style = cell_text(size = 'x-large'),
              locations = cells_body(columns = 1))

We can also color columns and cells, and present values in bold or italics:

msummary(models) %>%
    tab_style(style = cell_fill(color = "lightcyan"),
              locations = cells_body(columns = vars(`OLS 1`))) %>%
    tab_style(style = cell_fill(color = "#F9E3D6"),
              locations = cells_body(columns = vars(`Poisson 2`), rows = 2:6)) %>%
    tab_style(style = cell_text(weight = "bold"),
              locations = cells_body(columns = vars(`OLS 1`))) %>%
    tab_style(style = cell_text(style = "italic"),
              locations = cells_body(columns = vars(`Poisson 2`), rows = 2:6))
OLS 1 Poisson 1 OLS 2 Poisson 2 OLS 3
(Intercept) 7948.667 8.241 16259.384 9.876 11243.544
(2078.276) (0.006) (2611.140) (0.003) (1011.240)
Clergy 15.257 77.148 -16.376
(25.735) (32.334) (12.522)
Literacy -39.121 0.003 3.680 -0.000 -68.507
(37.052) (0.000) (46.552) (0.000) (18.029)
Commerce 0.011 0.001
(0.000) (0.000)
Num.Obs. 86 86 86 86 86
R2 0.020 0.065 0.152
Adj.R2 -0.003 0.043 0.132
AIC 1740.8 274160.8 1780.0 257564.4 1616.9
BIC 1750.6 274168.2 1789.9 257571.7 1626.7
Log.Lik. -866.392 -137077.401 -886.021 -128779.186 -804.441

Column groups

Create spanning labels to group models (columns):

msummary(models) %>%
    tab_spanner(label = 'Literacy', columns = c('OLS 1', 'Poisson 1')) %>%
    tab_spanner(label = 'Desertion', columns = c('OLS 2', 'Poisson 2')) %>%
    tab_spanner(label = 'Clergy', columns = 'OLS 3')

Images

Insert images in your tables using the gt::text_transform and gt::local_image functions.

f <- function(x) web_image(url = "https://user-images.githubusercontent.com/987057/82732352-b9aabf00-9cda-11ea-92a6-26750cf097d0.png", height = 80)

msummary(models) %>%
    text_transform(locations = cells_body(columns = 2:6, rows = 1), fn = f)
OLS 1 Poisson 1 OLS 2 Poisson 2 OLS 3
(Intercept)
(2078.276) (0.006) (2611.140) (0.003) (1011.240)
Clergy 15.257 77.148 -16.376
(25.735) (32.334) (12.522)
Literacy -39.121 0.003 3.680 -0.000 -68.507
(37.052) (0.000) (46.552) (0.000) (18.029)
Commerce 0.011 0.001
(0.000) (0.000)
Num.Obs. 86 86 86 86 86
R2 0.020 0.065 0.152
Adj.R2 -0.003 0.043 0.132
AIC 1740.8 274160.8 1780.0 257564.4 1616.9
BIC 1750.6 274168.2 1789.9 257571.7 1626.7
Log.Lik. -866.392 -137077.401 -886.021 -128779.186 -804.441

Complex example

This is the code I used to generate the “complex” table posted at the top of this README.

cm <- c('Literacy' = 'Literacy (%)',
        'Clergy' = 'Priests/capita',
        'Commerce' = 'Patents/capita',
        'Infants' = 'Infants',
        '(Intercept)' = 'Constant')

msummary(models,
         coef_map = cm,
         stars = TRUE,
         gof_omit = "Deviance",
         title = 'modelsummary package for R',
         notes = c('The most important parameter is printed in red.')) %>%
    tab_spanner(label = 'Donations', columns = 2:3) %>%
    tab_spanner(label = 'Crimes (persons)', columns = 4:5) %>%
    tab_spanner(label = 'Crimes (property)', columns = 6) %>%
    tab_footnote(footnote = md("Very **important** variable."),
                 locations = cells_body(rows = 3, columns = 1)) %>%
    tab_style(style = cell_text(color = 'red'),
              locations = cells_body(rows = 3, columns = 4))

Appearance: kableExtra

Note that compiling this LaTeX table requires loading the booktabs and xcolor packages in the preamble of your LaTeX or Rmarkdown document.

The gt LaTeX render engine is still immature. Until it improves, I strongly recommend that users turn to kableExtra to produce LaTeX tables. This package offers robust functions that allow a lot of customization. A simple LaTeX table can be produced as follows:

msummary(models, output = 'latex')

We can use functions from the kableExtra package to customize this table, with bold and colored cells, column spans, and more.

Fonts, colors and styles

The row_spec and column_spec allow users to change the styling of their tables. For instance, this code creates a table where the first column is in bold blue text on pink background:

msummary(models, output = 'latex') %>%
    row_spec(1, bold = TRUE, color = 'blue', background = 'pink')

Column groups

You can define column group labels using kableExtra’s add_header_above function:

msummary(models, output = 'latex') %>%
    add_header_above(c(" " = 1, 
                       "Donations" = 2, 
                       "Crimes (person)" = 2, 
                       "Crimes (property)" = 1))

Complex example

cm <- c('Literacy' = 'Literacy (%)',
        'Clergy' = 'Priests/capita',
        'Commerce' = 'Patents/capita',
        'Infants' = 'Infants',
        '(Intercept)' = 'Constant')

msummary(models,
    coef_map = cm,
    stars = TRUE,
    gof_omit = "Deviance",
    title = 'modelsummary package for R',
    notes = c('First custom note to contain text.',
              'Second custom note with different content.')) %>%
    add_header_above(c(" " = 1,
                       "Donations" = 2,
                       "Crimes (person)" = 2,
                       "Crimes (property)" = 1))
   row_spec(3, bold = TRUE, color = 'blue', background = 'pink')