modelsummary includes a powerful set of utilities to customize the information displayed in your model summary tables. You can easily rename, reorder, subset or omit parameter estimates; choose the set of goodness-of-fit statistics to display; display various “robust” standard errors or confidence intervals; add titles, footnotes, or source notes; insert stars or custom characters to indicate levels of statistical significance; or add rows with supplemental information about your models.

library(modelsummary)
library(kableExtra)
library(gt)

url <- 'https://vincentarelbundock.github.io/Rdatasets/csv/HistData/Guerry.csv'
dat <- read.csv(url)

models <- list(
  "OLS 1"     = lm(Donations ~ Literacy + Clergy, data = dat),
  "Poisson 1" = glm(Donations ~ Literacy + Commerce, family = poisson, data = dat),
  "OLS 2"     = lm(Crime_pers ~ Literacy + Clergy, data = dat),
  "Poisson 2" = glm(Crime_pers ~ Literacy + Commerce, family = poisson, data = dat),
  "OLS 3"     = lm(Crime_prop ~ Literacy + Clergy, data = dat)
)

modelsummary(models)
                    OLS 1         Poisson 1     OLS 2         Poisson 2     OLS 3
(Intercept)         7948.667      8.241         16259.384     9.876         11243.544
                    (2078.276)    (0.006)       (2611.140)    (0.003)       (1011.240)
Literacy            -39.121       0.003         3.680         0.000         -68.507
                    (37.052)      (0.000)       (46.552)      (0.000)       (18.029)
Clergy              15.257                      77.148                      -16.376
                    (25.735)                    (32.334)                    (12.522)
Commerce                          0.011                       0.001
                                  (0.000)                     (0.000)
Num.Obs.            86            86            86            86            86
R2                  0.020                       0.065                       0.152
R2 Adj.             -0.003                      0.043                       0.132
AIC                 1740.8        274160.8      1780.0        257564.4      1616.9
BIC                 1750.6        274168.2      1789.9        257571.7      1626.7
Log.Lik.            -866.392      -137077.401   -886.021      -128779.186   -804.441
F                   0.866                       2.903                       7.441

Uncertainty estimates: SE, t, p, CI

By default, modelsummary prints an uncertainty estimate in parentheses below the corresponding coefficient estimate. The value of this estimate is determined by the statistic argument.

statistic must be a string equal to conf.int or to the name of one of the columns produced by the broom::tidy function.

modelsummary(models, statistic = 'std.error')
modelsummary(models, statistic = 'p.value')
modelsummary(models, statistic = 'statistic')
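To see which column names are available, it can help to inspect the output of broom::tidy directly. The snippet below is a minimal sketch: it fits a small model on the built-in mtcars data (an illustrative choice, not one of the models above) and lists the columns that tidy returns.

```r
library(broom)

# Illustrative model on a built-in dataset (an assumption, not from the examples above)
mod <- lm(mpg ~ hp, data = mtcars)

# tidy() returns one row per term; columns other than `term` and `estimate`
# are candidates for the `statistic` argument
td <- tidy(mod)
names(td)
```

For a linear model, the columns include std.error, statistic, and p.value, which are the three values used in the calls above.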

You can display confidence intervals in brackets by setting statistic="conf.int":

modelsummary(models, statistic = 'conf.int', conf_level = .99)
              OLS 1                   Poisson 1               OLS 2                   Poisson 2               OLS 3
(Intercept)   7948.667                8.241                   16259.384               9.876                   11243.544
              [2469.565, 13427.769]   [8.226, 8.256]          [9375.457, 23143.311]   [9.867, 9.885]          [8577.542, 13909.546]
Literacy      -39.121                 0.003                   3.680                   0.000                   -68.507
              [-136.804, 58.562]      [0.003, 0.003]          [-119.048, 126.408]     [0.000, 0.000]          [-116.037, -20.976]
Clergy        15.257                                          77.148                                          -16.376
              [-52.591, 83.105]                               [-8.096, 162.392]                               [-49.389, 16.637]
Commerce                              0.011                                           0.001
                                      [0.011, 0.011]                                  [0.001, 0.001]
Num.Obs.      86                      86                      86                      86                      86
R2            0.020                                           0.065                                           0.152
R2 Adj.       -0.003                                          0.043                                           0.132
AIC           1740.8                  274160.8                1780.0                  257564.4                1616.9
BIC           1750.6                  274168.2                1789.9                  257571.7                1626.7
Log.Lik.      -866.392                -137077.401             -886.021                -128779.186             -804.441
F             0.866                                           2.903                                           7.441

To display uncertainty estimates next to coefficients instead of below them:

modelsummary(models, statistic_vertical = FALSE)

You can override the uncertainty estimates in a number of ways. First, you can specify a function that produces variance-covariance matrices:

library(sandwich)
modelsummary(models, statistic_override = vcovHC, statistic = 'p.value')

You can supply a list of functions of the same length as your model list:

modelsummary(models,
   statistic_override = list(vcov, vcovHC, vcovHAC, vcovHC, vcov))

You can supply a list of named variance-covariance matrices:

vcov_matrices <- lapply(models, vcovHC)
modelsummary(models, statistic_override = vcov_matrices)
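As a reminder of how a variance-covariance matrix relates to the numbers printed in parentheses, the sketch below computes standard errors as the square roots of the diagonal entries. It uses base R and the built-in mtcars data for illustration; that modelsummary performs an equivalent computation internally is an assumption.

```r
# Illustrative model on a built-in dataset
mod <- lm(mpg ~ hp + wt, data = mtcars)

# Classical variance-covariance matrix of the coefficients
V <- vcov(mod)

# Standard errors are the square roots of the diagonal entries
se <- sqrt(diag(V))

# They agree with the standard errors reported by summary()
se_summary <- summary(mod)$coefficients[, "Std. Error"]
all.equal(se, se_summary)
```

Swapping vcov for a robust estimator such as sandwich::vcovHC in the same computation would give heteroskedasticity-robust standard errors instead.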

You can supply a list of named vectors:

custom_stats <- list(`OLS 1` = c(`(Intercept)` = 2, Literacy = 3, Clergy = 4),
                     `Poisson 1` = c(`(Intercept)` = 3, Literacy = -5, Commerce = 3),
                     `OLS 2` = c(`(Intercept)` = 7, Literacy = -6, Clergy = 9),
                     `Poisson 2` = c(`(Intercept)` = 4, Literacy = -7, Commerce = -9),
                     `OLS 3` = c(`(Intercept)` = 1, Literacy = -5, Clergy = -2))
modelsummary(models, statistic_override = custom_stats)

You can also display several different uncertainty estimates below the coefficient estimates. For example, the following call prints the standard error, the p-value, and the confidence interval under each coefficient:

modelsummary(models, statistic = c('std.error', 'p.value', 'conf.int'))

Alignment

By default, modelsummary will align the first column (with coefficient names) to the left, and will center the results columns. To change this default, you can use the align argument, which accepts a string or vector of strings:

modelsummary(models, align="lrrrrr")
modelsummary(models, align=c("l", "r", "r", "r", "r", "r"))
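When the number of models varies, the alignment string can be built programmatically. The helper below is hypothetical (make_align is not part of modelsummary) and only sketches how the two equivalent forms shown above relate to each other:

```r
# Hypothetical helper: left-align the first column, right-align one column per model
make_align <- function(n_models) {
  paste0("l", strrep("r", n_models))
}

align_string <- make_align(5)                     # "lrrrrr"
align_vector <- strsplit(align_string, "")[[1]]   # c("l", "r", "r", "r", "r", "r")
```

Either align_string or align_vector could then be passed to the align argument.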

Many users want to align values on the decimal point. Unfortunately, I am not aware of a single way to achieve this outcome for all output formats (PDF/LaTeX, markdown, HTML). However, decimal alignment is possible in LaTeX or Rmarkdown PDF documents by using the dcolumn LaTeX package.

Titles

You can add a title to your table as follows:

modelsummary(models, title = 'This is a title for my table.')

Notes

Add notes to the bottom of your table:

modelsummary(models,
   notes = list('Text of the first note.',
                'Text of the second note.'))

Rename, reorder, and subset

modelsummary offers a powerful and innovative mechanism to rename, reorder, and subset coefficients and goodness-of-fit statistics.

Coefficient estimates

The coef_map argument is a named vector which allows users to rename, reorder, and subset coefficient estimates. Values of this vector correspond to the “clean” variable name. Names of this vector correspond to the “raw” variable name. The table will be sorted in the order in which terms are presented in coef_map. Coefficients which are not included in coef_map will be excluded from the table.

cm <- c('Literacy' = 'Literacy (%)',
        'Commerce' = 'Patents per capita',
        '(Intercept)' = 'Constant')
modelsummary(models, coef_map = cm)
                    OLS 1         Poisson 1     OLS 2         Poisson 2     OLS 3
Literacy (%)        -39.121       0.003         3.680         0.000         -68.507
                    (37.052)      (0.000)       (46.552)      (0.000)       (18.029)
Patents per capita                0.011                       0.001
                                  (0.000)                     (0.000)
Constant            7948.667      8.241         16259.384     9.876         11243.544
                    (2078.276)    (0.006)       (2611.140)    (0.003)       (1011.240)
Num.Obs.            86            86            86            86            86
R2                  0.020                       0.065                       0.152
R2 Adj.             -0.003                      0.043                       0.132
AIC                 1740.8        274160.8      1780.0        257564.4      1616.9
BIC                 1750.6        274168.2      1789.9        257571.7      1626.7
Log.Lik.            -866.392      -137077.401   -886.021      -128779.186   -804.441
F                   0.866                       2.903                       7.441
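The rename, reorder, and subset behavior of coef_map can be mimicked on a plain named vector of coefficients. This is only a sketch of the semantics described above, written in base R on the built-in mtcars data; the variable names and labels are illustrative, and it is not modelsummary's actual implementation.

```r
# Illustrative model; variable names differ from the Guerry examples above
mod <- lm(mpg ~ hp + wt + qsec, data = mtcars)
est <- coef(mod)  # named vector: (Intercept), hp, wt, qsec

# Names are "raw" terms, values are "clean" labels, order is the display order
cm <- c("wt" = "Weight (1000 lbs)",
        "hp" = "Horsepower",
        "(Intercept)" = "Constant")

# Keep only mapped terms, in the order given by the map, then relabel
keep <- intersect(names(cm), names(est))
out <- setNames(est[keep], unname(cm[keep]))
names(out)
```

Because qsec does not appear in cm, it is dropped, and the remaining terms come out in the order in which they were listed in the map.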

An alternative mechanism to subset coefficients is to use the coef_omit argument. This string is a regular expression which will be fed to stringr::str_detect to detect the variable names which should be excluded from the table.

modelsummary(models, coef_omit = "Intercept|Commerce", gof_omit=".*")
                    OLS 1         Poisson 1     OLS 2         Poisson 2     OLS 3
Literacy            -39.121       0.003         3.680         0.000         -68.507
                    (37.052)      (0.000)       (46.552)      (0.000)       (18.029)
Clergy              15.257                      77.148                      -16.376
                    (25.735)                    (32.334)                    (12.522)

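Since coef_omit accepts regexes, you can do interesting things with it, such as specifying the variables that modelsummary should keep instead of omit. The example below uses the [^...] construct, a negated character class that matches any single character not listed between the brackets. Inside the brackets, the |, ( and ) symbols are literal characters rather than operators, so the pattern flags every name containing at least one character that appears in neither Commerce nor (Intercept). This keeps the two intended terms here, but it is fragile: any other name built only from those characters (Income, for instance) would be kept as well.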

modelsummary(models, coef_omit = "[^Commerce|(Intercept)]", gof_omit=".*")
                    OLS 1         Poisson 1     OLS 2         Poisson 2     OLS 3
(Intercept)         7948.667      8.241         16259.384     9.876         11243.544
                    (2078.276)    (0.006)       (2611.140)    (0.003)       (1011.240)
Commerce                          0.011                       0.001
                                  (0.000)                     (0.000)

Goodness-of-fit and other statistics

gof_omit is a regular expression which will be fed to stringr::str_detect to detect the names of the statistics which should be excluded from the table.

modelsummary(models, gof_omit = 'DF|Deviance|R2|AIC|BIC')
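The effect of such a pattern can be previewed on the statistic names themselves. The sketch below applies the same regular expression to the row labels of the default table shown earlier, using base R's grepl as a stand-in for stringr::str_detect; that the pattern is matched against these displayed labels is an assumption.

```r
# Row labels of the goodness-of-fit section in the default table above
gof_names <- c("Num.Obs.", "R2", "R2 Adj.", "AIC", "BIC", "Log.Lik.", "F")

# TRUE for every statistic the pattern would exclude
omit <- grepl("DF|Deviance|R2|AIC|BIC", gof_names)
gof_names[!omit]   # "Num.Obs." "Log.Lik." "F"
```

Note that R2 also matches "R2 Adj.", because the pattern is not anchored to the start or end of the name.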

A more powerful mechanism is to supply a data.frame (or tibble) through the gof_map argument. This data.frame must include 4 columns:

  1. raw: a string with the name of a column produced by broom::glance(model).
  2. clean: a string with the “clean” name of the statistic you want to appear in your final table.
  3. fmt: a string which will be used to round/format the statistic in question (e.g., "%.3f"). This follows the same standards as the fmt argument in ?modelsummary.
  4. omit: TRUE if you want the statistic to be omitted from your final table.

You can see an example of a valid data frame by typing modelsummary::gof_map. This is the default data.frame that modelsummary uses to subset and reorder goodness-of-fit statistics. As you can see, omit == TRUE for quite a number of statistics. You can include them all by setting omit to FALSE:

gm <- modelsummary::gof_map
gm$omit <- FALSE
modelsummary(models, gof_map = gm)

The goodness-of-fit statistics will be printed in the table in the same order as in the gof_map data.frame.
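A custom data.frame with the four required columns can also be written from scratch. The sketch below builds one and applies it by hand to a few statistics computed with base R, to show how the raw, clean, fmt, and omit columns interact. The raw names are chosen to resemble broom::glance columns, which is an assumption, and the manual formatting step is an illustration rather than modelsummary's own code.

```r
# Illustrative model on a built-in dataset
mod <- lm(mpg ~ factor(cyl), data = mtcars)

# Raw statistics (computed here with base R instead of broom::glance)
raw_stats <- c(nobs = nobs(mod),
               r.squared = summary(mod)$r.squared,
               AIC = AIC(mod))

# A gof_map-style data frame with the four required columns
gm <- data.frame(
  raw   = c("r.squared", "nobs", "AIC"),
  clean = c("R2", "Num.Obs.", "AIC"),
  fmt   = c("%.3f", "%.0f", "%.1f"),
  omit  = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Drop omitted rows, then format each statistic in the order given by the map
shown <- gm[!gm$omit, ]
formatted <- setNames(sprintf(shown$fmt, raw_stats[shown$raw]), shown$clean)
formatted
```

In this sketch R2 is listed before Num.Obs., so it would be printed first, and AIC is dropped because its omit entry is TRUE.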

Notice the subtle difference between coef_map and gof_map. On the one hand, coef_map works as a “white list”: any coefficient not explicitly entered will be omitted from the table. On the other, gof_map works as a “black list”: statistics need to be explicitly marked for omission.

Stars: Statistical significance markers

Some people like to add “stars” to their model summary tables to mark statistical significance. The stars argument can take three types of input:

  1. NULL omits any stars or special marks (default)
  2. TRUE uses these default values: * p < 0.1, ** p < 0.05, *** p < 0.01
  3. Named numeric vector for custom stars.

modelsummary(models)
modelsummary(models, stars = TRUE)
modelsummary(models, stars = c('+' = .1, '&' = .01))

Whenever stars are displayed, modelsummary automatically adds a note at the bottom of the table. To omit this note, set the stars_note argument to FALSE:

modelsummary(models, stars = TRUE, stars_note = FALSE)

If you want to create your own stars description, you can add custom notes with the notes argument.
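To make the meaning of the named numeric vector concrete, the sketch below maps p-values to markers in base R. The helper make_stars is hypothetical and not part of modelsummary; the strict inequality (p below the threshold) is assumed from the wording of the default note.

```r
# Hypothetical helper: map p-values to significance markers
make_stars <- function(p, thresholds) {
  # Apply the loosest threshold first so that stricter ones overwrite it
  thresholds <- sort(thresholds, decreasing = TRUE)
  out <- rep("", length(p))
  for (i in seq_along(thresholds)) {
    out[p < thresholds[i]] <- names(thresholds)[i]
  }
  out
}

p <- c(0.2, 0.07, 0.03, 0.001)
default_marks <- make_stars(p, c("*" = 0.1, "**" = 0.05, "***" = 0.01))  # "" "*" "**" "***"
custom_marks  <- make_stars(p, c("+" = 0.1, "&" = 0.01))                 # "" "+" "+" "&"
```

With the custom vector, a p-value of 0.03 receives only the + marker because no threshold is defined between 0.1 and 0.01.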

Rounding

The fmt argument defines how numeric values are rounded and presented in the table. This argument follows the sprintf C-library standard. For example,

  • %.3f will keep 3 digits after the decimal point, including trailing zeros.
  • %.5f will keep 5 digits after the decimal point, including trailing zeros.
  • Replacing the f with an e switches to exponential (scientific) notation.

Most users will just modify the 3 in %.3f, but this is a very powerful system, and all users are encouraged to read the details: ?sprintf

modelsummary(models, fmt = '%.7f')
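A format string can be tried out with sprintf before it is passed to fmt. The values below are arbitrary illustrations:

```r
x <- 12345.6789

fixed3 <- sprintf("%.3f", x)    # "12345.679"
fixed5 <- sprintf("%.5f", 0.5)  # "0.50000"
expo3  <- sprintf("%.3e", x)    # "1.235e+04"
whole  <- sprintf("%.0f", 86)   # "86"
```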

add_rows

Use the add_rows argument to add rows manually to a table. For example, let’s say you estimate two models with a factor variable and you want to insert (a) an empty line to identify the category of reference, and (b) customized information at the bottom of the table:

models <- list()
models[['OLS']] <- lm(mpg ~ factor(cyl), mtcars)
models[['Logit']] <- glm(am ~ factor(cyl), mtcars, family = binomial)

We create a data.frame with the same number of columns as the summary table. Then, we define a “position” attribute to specify where the new rows should be inserted in the table. Finally, we pass this data.frame to the add_rows argument:

library(tibble)
rows <- tribble(~term,          ~OLS,  ~Logit,
                'factor(cyl)4', '-',   '-',
                'Info',         '???', 'XYZ')
attr(rows, 'position') <- c(3, 9)

modelsummary(models, add_rows = rows)
                OLS         Logit
(Intercept)     26.664      0.981
                (0.972)     (0.677)
factor(cyl)4    -           -
factor(cyl)6    -6.921      -1.269
                (1.558)     (1.021)
factor(cyl)8    -11.564     -2.773
                (1.299)     (1.021)
Num.Obs.        32          32
Info            ???         XYZ
R2              0.732
R2 Adj.         0.714
AIC             170.6       39.9
BIC             176.4       44.3
Log.Lik.        -81.282     -16.967
F               39.698
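The position attribute refers to row numbers in the final table, after insertion. The sketch below reproduces that logic on a small data.frame in base R. The helper insert_rows is hypothetical and the table is shortened for illustration; it shows the indexing idea rather than modelsummary's internals.

```r
# Hypothetical helper: place `rows` at the given positions of the final table
insert_rows <- function(tab, rows, position) {
  n <- nrow(tab) + nrow(rows)
  combined <- rbind(tab, rows)
  idx <- integer(n)
  idx[position] <- nrow(tab) + seq_len(nrow(rows))  # new rows go to `position`
  idx[-position] <- seq_len(nrow(tab))              # original rows fill the rest
  out <- combined[idx, ]
  rownames(out) <- NULL
  out
}

# Shortened one-model table (values copied from the OLS column above)
tab <- data.frame(term = c("(Intercept)", "", "factor(cyl)6", "", "Num.Obs."),
                  OLS  = c("26.664", "(0.972)", "-6.921", "(1.558)", "32"),
                  stringsAsFactors = FALSE)
rows <- data.frame(term = c("factor(cyl)4", "Info"),
                   OLS  = c("-", "???"),
                   stringsAsFactors = FALSE)

result <- insert_rows(tab, rows, position = c(3, 7))
result$term
```

Here the reference category lands in row 3 and the Info row in row 7, the last row of the shortened table.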

Extra tidy arguments (e.g., exponentiated coefficients)

Users can pass any additional argument they want to the tidy method which is used to extract estimates from a model. For example, in logistic or Cox proportional hazard models, many users want to exponentiate coefficients to facilitate interpretation. The tidy functions supplied by the broom package allow users to set exponentiate=TRUE to achieve this. In modelsummary, users can use the same argument:

mod_logit <- glm(am ~ mpg, data = mtcars, family = binomial)
modelsummary(mod_logit, exponentiate = TRUE)

Any argument supported by tidy is thus supported by modelsummary.

Warning: at the moment (2020-05-05), broom::tidy still reports std.error on the original scale. See this discussion on the broom GitHub page.
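To see what the warning above means in practice, the sketch below exponentiates the coefficients by hand in base R and, as one possible remedy, computes delta-method standard errors on the odds-ratio scale. The delta-method step is a standard approximation added here for illustration; it is not something the text above says modelsummary or broom does.

```r
mod_logit <- glm(am ~ mpg, data = mtcars, family = binomial)

b  <- coef(mod_logit)                 # coefficients on the log-odds scale
se <- sqrt(diag(vcov(mod_logit)))     # standard errors on the log-odds scale

odds_ratio <- exp(b)                  # exponentiated coefficients (odds ratios)
se_delta   <- exp(b) * se             # delta-method standard errors for the odds ratios

cbind(estimate = b, odds_ratio = odds_ratio, std.error = se, se_delta = se_delta)
```

Comparing the std.error and se_delta columns shows how far apart the two scales can be, which is why a table mixing exponentiated estimates with original-scale standard errors should be read with care.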

Customizing the look of your table

To customize the appearance of tables, modelsummary supports four of the most popular table-making packages:

  1. gt: https://gt.rstudio.com
  2. kableExtra: http://haozhu233.github.io/kableExtra
  3. huxtable: https://hughjonesd.github.io/huxtable/
  4. flextable: https://davidgohel.github.io/flextable/

Users are encouraged to visit these websites to determine which package suits their needs best. Each of them has different strengths and weaknesses. For instance, gt allows seamless integration with the RStudio IDE, but kableExtra’s LaTeX (and PDF) output is far more mature.

To create customized tables, the analyst begins by calling modelsummary(models) to create a summary table. Then, she post-processes the table by applying functions from one of the packages listed above. It is often convenient to use the %>% operator to do this.

To illustrate, we download data from the Rdatasets repository and we estimate 5 models:

library(modelsummary)

url <- 'https://vincentarelbundock.github.io/Rdatasets/csv/HistData/Guerry.csv'
dat <- read.csv(url)

models <- list()
models[['OLS 1']] <- lm(Donations ~ Literacy, data = dat)
models[['Poisson 1']] <- glm(Donations ~ Literacy + Clergy, family = poisson, data = dat)
models[['OLS 2']] <- lm(Crime_pers ~ Literacy, data = dat)
models[['Poisson 2']] <- glm(Crime_pers ~ Literacy + Clergy, family = poisson, data = dat)
models[['OLS 3']] <- lm(Crime_prop ~ Literacy + Clergy, data = dat)

In the rest of this vignette, we will customize tables using tools supplied by the gt, kableExtra, flextable, and huxtable packages. In each case, the pattern will be similar. First, we create a table by calling modelsummary and by specifying the output format with the output parameter. Then, we will use functions from the four packages to customize the appearance of our tables.

gt

We will customize a table using the following functions from the gt package:

  • tab_spanner creates labels to group columns.
  • tab_footnote adds a footnote and a matching marking in a specific cell.
  • tab_style can modify the text and color of rows, columns, or cells.

To produce a “cleaner” look, we will also use modelsummary’s stars, coef_map, gof_omit, and title arguments.

Note that in order to access gt functions, we must first load the library.

library(gt)

# build table with `modelsummary` 
cm <- c( '(Intercept)' = 'Constant', 'Literacy' = 'Literacy (%)', 'Clergy' = 'Priests/capita')
cap <- 'A modelsummary table customized with gt'

tab <- modelsummary(models,
                output = "gt",
                coef_map = cm, stars = TRUE,
                title = cap, gof_omit = 'IC|Log|Adj')

# customize table with `gt`

tab %>%

    # column labels
    tab_spanner(label = 'Donations', columns = 2:3) %>%
    tab_spanner(label = 'Crimes (persons)', columns = 4:5) %>%
    tab_spanner(label = 'Crimes (property)', columns = 6) %>%

    # footnote
    tab_footnote(footnote = md("A very **important** variable."),
                 locations = cells_body(rows = 3, columns = 1)) %>%

    # text and background color
    tab_style(style = cell_text(color = 'red'),
              locations = cells_body(rows = 3)) %>%
    tab_style(style = cell_fill(color = 'lightblue'),
              locations = cells_body(rows = 5))
A modelsummary table customized with gt
                    Donations                   Crimes (persons)            Crimes (property)
                    OLS 1         Poisson 1     OLS 2         Poisson 2     OLS 3
Constant            8759.068***   8.986***      20357.309***  9.708***      11243.544***
                    (1559.363)    (0.004)       (2020.980)    (0.003)       (1011.240)
Literacy (%)¹       -42.886       -0.006***     -15.358       0.000***      -68.507***
                    (36.362)      (0.000)       (47.127)      (0.000)       (18.029)
Priests/capita                    0.002***                    0.004***      -16.376
                                  (0.000)                     (0.000)       (12.522)
Num.Obs.            86            86            86            86            86
R2                  0.016                       0.001                       0.152
F                   1.391                       0.106                       7.441
* p < 0.1, ** p < 0.05, *** p < 0.01

¹ A very important variable.

The gt website offers many more examples. The possibilities are endless. For instance, gt allows you to embed images in your tables using the text_transform and web_image functions:

f <- function(x) web_image(url = "https://user-images.githubusercontent.com/987057/82732352-b9aabf00-9cda-11ea-92a6-26750cf097d0.png", height = 80)

tab %>%
    text_transform(locations = cells_body(columns = 2:6, rows = 1), fn = f)
A modelsummary table customized with gt
                    OLS 1         Poisson 1     OLS 2         Poisson 2     OLS 3
Constant            [image]       [image]       [image]       [image]       [image]
                    (1559.363)    (0.004)       (2020.980)    (0.003)       (1011.240)
Literacy (%)        -42.886       -0.006***     -15.358       0.000***      -68.507***
                    (36.362)      (0.000)       (47.127)      (0.000)       (18.029)
Priests/capita                    0.002***                    0.004***      -16.376
                                  (0.000)                     (0.000)       (12.522)
Num.Obs.            86            86            86            86            86
R2                  0.016                       0.001                       0.152
F                   1.391                       0.106                       7.441
* p < 0.1, ** p < 0.05, *** p < 0.01

kableExtra

We will now illustrate how to customize tables using functions from the kableExtra package:

  • add_header_above creates labels to group columns.
  • add_footnote adds a footnote and a matching marking in a specific cell.
  • row_spec can modify the text and background color of rows.

We use the same code as above, but specify output='kableExtra' in the modelsummary() call:

library(kableExtra)

# build table with `modelsummary` 
cm <- c( '(Intercept)' = 'Constant', 'Literacy' = 'Literacy (%)', 'Clergy' = 'Priests/capita')
cap <- 'A modelsummary table customized with kableExtra'

tab <- modelsummary(models, output = 'kableExtra',
                coef_map = cm, stars = TRUE,
                title = cap, gof_omit = 'IC|Log|Adj')

# customize table with `kableExtra`
tab %>%

    # column labels
    add_header_above(c(" " = 1, "Donations" = 2, "Crimes (person)" = 2, "Crimes (property)" = 1)) %>%

    # text and background color
    row_spec(3, color = 'red') %>%
    row_spec(5, background = 'lightblue')
A modelsummary table customized with kableExtra
                    Donations                   Crimes (person)             Crimes (property)
                    OLS 1         Poisson 1     OLS 2         Poisson 2     OLS 3
Constant            8759.068***   8.986***      20357.309***  9.708***      11243.544***
                    (1559.363)    (0.004)       (2020.980)    (0.003)       (1011.240)
Literacy (%)        -42.886       -0.006***     -15.358       0.000***      -68.507***
                    (36.362)      (0.000)       (47.127)      (0.000)       (18.029)
Priests/capita                    0.002***                    0.004***      -16.376
                                  (0.000)                     (0.000)       (12.522)
Num.Obs.            86            86            86            86            86
R2                  0.016                       0.001                       0.152
F                   1.391                       0.106                       7.441
* p < 0.1, ** p < 0.05, *** p < 0.01

These kableExtra functions can also be used to produce LaTeX / PDF tables.

flextable

We will now illustrate how to customize tables using functions from the flextable package:

  • color to modify the color of the text
  • bg to modify the color of the background
  • autofit sets column width to sensible values.

We use the same code as above, but specify output='flextable' in the modelsummary() call:

library(flextable)

# build table with `modelsummary` 
cm <- c( '(Intercept)' = 'Constant', 'Literacy' = 'Literacy (%)', 'Clergy' = 'Priests/capita')
cap <- 'A modelsummary table customized with flextable'

tab <- modelsummary(models, output = 'flextable',
                coef_map = cm, stars = TRUE,
                title = cap, gof_omit = 'IC|Log|Adj')

# customize table with `flextable`
tab %>%

    # text and background color
    color(3, color = 'red') %>%
    bg(5, bg = 'lightblue') %>%

    # column widths
    autofit()
A modelsummary table customized with flextable
                    OLS 1         Poisson 1     OLS 2         Poisson 2     OLS 3
Constant            8759.068***   8.986***      20357.309***  9.708***      11243.544***
                    (1559.363)    (0.004)       (2020.980)    (0.003)       (1011.240)
Literacy (%)        -42.886       -0.006***     -15.358       0.000***      -68.507***
                    (36.362)      (0.000)       (47.127)      (0.000)       (18.029)
Priests/capita                    0.002***                    0.004***      -16.376
                                  (0.000)                     (0.000)       (12.522)
Num.Obs.            86            86            86            86            86
R2                  0.016                       0.001                       0.152
F                   1.391                       0.106                       7.441
* p < 0.1, ** p < 0.05, *** p < 0.01

huxtable

We will now illustrate how to customize tables using functions from the huxtable package:

  • set_text_color to change the color of some entries

We use the same code as above, but specify output='huxtable' in the modelsummary() call:

library(huxtable)

# build table with `modelsummary` 
cm <- c( '(Intercept)' = 'Constant', 'Literacy' = 'Literacy (%)', 'Clergy' = 'Priests/capita')
cap <- 'A modelsummary table customized with huxtable'

tab <- modelsummary(models, output = 'huxtable',
                coef_map = cm, stars = TRUE,
                title = cap, gof_omit = 'IC|Log|Adj')

# customize table with `huxtable`
tab %>%

    # text color
    set_text_color(row = 4, col = 1:ncol(.), value = 'red')
A modelsummary table customized with huxtable
                    OLS 1         Poisson 1     OLS 2         Poisson 2     OLS 3
Constant            8759.068***   8.986***      20357.309***  9.708***      11243.544***
                    (1559.363)    (0.004)       (2020.980)    (0.003)       (1011.240)
Literacy (%)        -42.886       -0.006***     -15.358       0.000***      -68.507***
                    (36.362)      (0.000)       (47.127)      (0.000)       (18.029)
Priests/capita                    0.002***                    0.004***      -16.376
                                  (0.000)                     (0.000)       (12.522)
Num.Obs.            86            86            86            86            86
R2                  0.016                       0.001                       0.152
F                   1.391                       0.106                       7.441
* p < 0.1, ** p < 0.05, *** p < 0.01

Warning: Saving to file

When users supply a file name to the output argument, the table is written immediately to file. This means that users cannot post-process and customize the resulting table using functions from gt, kableExtra, huxtable, or flextable. When users specify a filename in the output argument, the modelsummary() call should be the final one in the chain.

This is OK:

modelsummary(models, output = 'table.html')

This is not OK:

modelsummary(models, output = 'table.html') %>%
    tab_spanner(label = 'Literacy', columns = c('OLS 1', 'Poisson 1'))

To save a customized table, you should apply all the customization functions you need before saving it using gt::gtsave, kableExtra::save_kable, or the appropriate helper function from the package that you are using to customize your table.
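The recommended order (build, customize, then save) might look like the sketch below for the gt backend. The model, the title text, and the output location are illustrative assumptions; tab_header and gtsave are gt functions, and the file is written to a temporary directory so that nothing in the working directory is overwritten.

```r
library(modelsummary)
library(gt)

# Illustrative model on a built-in dataset
mod <- lm(mpg ~ hp, data = mtcars)

# 1. Build the table as a gt object instead of writing it to file
tab <- modelsummary(mod, output = "gt")

# 2. Apply all customizations
tab <- tab_header(tab, title = "Saved after customization")

# 3. Save the customized table as the final step
gtsave(tab, filename = "table.html", path = tempdir())
```

The same pattern applies to the other backends, with the saving function swapped for the one provided by the package in use.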