Creates balance tables with summary statistics for different subsets of the data (e.g., control and treatment groups). It can also be used to create summary tables for full data sets. See the Details and Examples sections below, and the vignettes on the modelsummary website:

• https://vincentarelbundock.github.io/modelsummary/

• https://vincentarelbundock.github.io/modelsummary/articles/datasummary.html

## Usage

datasummary_balance(
  formula,
  data,
  output = "default",
  fmt = 1,
  title = NULL,
  notes = NULL,
  align = NULL,
  dinm = TRUE,
  dinm_statistic = "std.error",
  escape = TRUE,
  ...
)

## Arguments

formula

A one-sided formula with the "condition" or "column" variable on the right-hand side. Use ~1 to show summary statistics for the full data set.

data

A data.frame (or tibble). If this data includes columns called "blocks", "clusters", and/or "weights", the "estimatr" package will consider them when calculating the difference in means. If there is a weights column, the reported mean and standard errors will also be weighted.

output

filename or object type (character string)

• Supported filename extensions: .docx, .html, .tex, .md, .txt, .png, .jpg.

• Supported object types: "default", "html", "markdown", "latex", "latex_tabular", "data.frame", "gt", "kableExtra", "huxtable", "flextable", "jupyter". The "modelsummary_list" value produces a lightweight object which can be saved and fed back to the modelsummary function.

• Warning: Users should not supply a file name to the output argument if they intend to customize the table with external packages. See the 'Details' section.

• LaTeX compilation requires the booktabs and siunitx packages, but siunitx can be disabled or replaced with global options. See the 'Details' section.

• The default output formats and table-making packages can be modified with global options. See the 'Details' section.
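As a rough illustration of the two kinds of values described above (an object type versus a file name), the sketch below requests a data.frame and then writes a Markdown file. It assumes the modelsummary package is installed and skips the calls otherwise; `tab` and `md_file` are hypothetical names, and `dinm = FALSE` is set only so that the example does not depend on estimatr.

```r
# Sketch only: the calls run if modelsummary is installed, and are skipped otherwise.
tab <- NULL
if (requireNamespace("modelsummary", quietly = TRUE)) {
  # Object type: return the balance table as a data.frame for further processing
  tab <- modelsummary::datasummary_balance(
    ~am, data = mtcars, dinm = FALSE, output = "data.frame"
  )

  # File name: the extension (.md) selects the format; the path is a temporary file
  md_file <- tempfile(fileext = ".md")
  modelsummary::datasummary_balance(
    ~am, data = mtcars, dinm = FALSE, output = md_file
  )
}
```

Returning an object rather than writing a file is the route to take when the table will be customized further with another package, as the warning above notes.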

fmt

determines how to format numeric values

• integer: the number of digits to keep after the period, as in format(round(x, fmt), nsmall=fmt)

• character: passed to the sprintf function (e.g., '%.3f' keeps 3 digits with trailing zero). See ?sprintf

• function: returns a formatted character string.

• NULL: does not format numbers, which allows users to include functions in the "glue" strings in the estimate and statistic arguments.
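The first three kinds of fmt values can be previewed in base R without building a table. This is a minimal sketch of what each form does to a single number; `x`, `as_integer`, `as_sprintf`, and `fmt_fun` are hypothetical names used only for illustration.

```r
x <- 3.14159

# integer: digits kept after the period, using the expression quoted above
fmt <- 1
as_integer <- format(round(x, fmt), nsmall = fmt)   # "3.1"

# character: a sprintf template; trailing zeros are kept
as_sprintf <- sprintf("%.3f", 2.5)                  # "2.500"

# function: any function that returns a formatted character string
fmt_fun <- function(v) formatC(v, format = "f", digits = 2, big.mark = ",")
as_function <- fmt_fun(1234.5)
```

A value such as `fmt_fun` could then be supplied as `fmt = fmt_fun` in a call to datasummary_balance.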

title

string

notes

list or vector of notes to append to the bottom of the table.

align

A string with a number of characters equal to the number of columns in the table (e.g., align = "lcc"). Valid characters: l, c, r, d.

• "l": left-aligned column

• "c": centered column

• "r": right-aligned column

• "d": dot-aligned column. Only supported for LaTeX/PDF tables produced by kableExtra. These commands must appear in the LaTeX preamble (they are added automatically when compiling Rmarkdown documents to PDF):

• \usepackage{booktabs}

• \usepackage{siunitx}

• \newcolumntype{d}{S[input-symbols = ()]}

a data.frame (or tibble) with the same number of rows as your main table.

a data.frame (or tibble) with the same number of columns as your main table. By default, rows are appended to the bottom of the table. You can define a "position" attribute of integers to set the row positions. See Examples section below.

dinm

TRUE calculates a difference in means with uncertainty estimates. This option is only available if the estimatr package is installed. If data includes columns named "blocks", "clusters", or "weights", this information will be taken into account automatically by estimatr::difference_in_means.

dinm_statistic

string: "std.error" or "p.value"
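The sketch below shows one way the dinm and dinm_statistic arguments might be combined with a weights column. It assumes both modelsummary and estimatr are installed and skips the call otherwise. The weights are random numbers generated purely for illustration; the only thing taken from the text above is that the column must be named "weights".

```r
dat <- mtcars

# Hypothetical weights, for illustration only; the column name "weights"
# is what the documentation says is picked up automatically.
set.seed(1)
dat$weights <- runif(nrow(dat), min = 0.5, max = 1.5)

tab <- NULL
if (requireNamespace("modelsummary", quietly = TRUE) &&
    requireNamespace("estimatr", quietly = TRUE)) {
  # Report p-values instead of standard errors next to the difference in means
  tab <- modelsummary::datasummary_balance(
    ~am,
    data = dat,
    dinm = TRUE,
    dinm_statistic = "p.value",
    output = "data.frame"
  )
}
```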

escape

boolean. TRUE escapes or substitutes LaTeX/HTML characters which could prevent the file from compiling/displaying. This setting does not affect captions or notes.

...

all other arguments are passed through to the table-making functions kableExtra::kbl or gt::gt, depending on the output argument. This allows users to pass arguments directly to datasummary in order to affect the behavior of other functions behind the scenes.

## Global Options

The behavior of modelsummary can be affected by setting global options:

• modelsummary_factory_default

• modelsummary_factory_latex

• modelsummary_factory_html

• modelsummary_factory_png

• modelsummary_get

• modelsummary_format_numeric_latex

• modelsummary_format_numeric_html

### Table-making packages

modelsummary supports 4 table-making packages: kableExtra, gt, flextable, and huxtable. Some of these packages have overlapping functionalities. For example, 3 of those packages can export to LaTeX. To change the default backend used for a specific file format, you can use the options function:

options(modelsummary_factory_html = 'kableExtra')
options(modelsummary_factory_latex = 'gt')
options(modelsummary_factory_word = 'huxtable')
options(modelsummary_factory_png = 'gt')
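Because these are ordinary R global options, they persist for the rest of the session. One possible pattern, sketched here with base R only, is to save the previous value when setting an option and restore it afterwards; `old` and `current` are hypothetical names.

```r
# options() invisibly returns the previous values of the options it changes
old <- options(modelsummary_factory_html = "kableExtra")

# Inspect the value now in effect
current <- getOption("modelsummary_factory_html")

# ... build HTML tables here ...

# Restore whatever was set before (including "unset")
options(old)
```

The same pattern applies to the other option names listed above.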

### Model extraction functions

modelsummary can use two sets of packages to extract information from statistical models: the easystats family (performance and parameters) and broom. By default, it uses easystats first and then falls back on broom in case of failure. You can change the order of priorities or include goodness-of-fit extracted by both packages by setting:

options(modelsummary_get = "broom")
options(modelsummary_get = "easystats")
options(modelsummary_get = "all")

### Formatting numeric entries

By default, LaTeX tables enclose all numeric entries in the \num{} command from the siunitx package. To prevent this behavior, or to enclose numbers in dollar signs (for LaTeX math mode), users can call:

options(modelsummary_format_numeric_latex = "plain")
options(modelsummary_format_numeric_latex = "mathmode")

A similar option can be used to display numerical entries using MathJax in HTML tables:

options(modelsummary_format_numeric_html = "mathjax")

## References

Arel-Bundock V (2022). “modelsummary: Data and Model Summaries in R.” Journal of Statistical Software, 103(1), 1-23. doi:10.18637/jss.v103.i01.

## Examples

if (FALSE) {
  library(modelsummary)
  # Balance table comparing automatic (am = 0) and manual (am = 1) cars
  datasummary_balance(~am, mtcars)
}