# Cross tabulations for categorical variables

Source: `R/datasummary_crosstab.R`

Convenience function to tabulate counts, cell percentages, and row/column
percentages for categorical variables. See the Details section for a
description of the internal design. For more complex cross tabulations, use
`datasummary` directly. See the Details and Examples sections below,
and the vignettes on the `modelsummary` website:

https://vincentarelbundock.github.io/modelsummary/

https://vincentarelbundock.github.io/modelsummary/articles/datasummary.html

## Usage

```
datasummary_crosstab(
  formula,
  statistic = 1 ~ 1 + N + Percent("row"),
  data,
  output = "default",
  fmt = 1,
  title = NULL,
  notes = NULL,
  align = NULL,
  add_columns = NULL,
  add_rows = NULL,
  sparse_header = TRUE,
  escape = TRUE,
  ...
)
```

## Arguments

- formula
A two-sided formula to describe the table: rows ~ columns, where rows and columns are variables in the data. Rows and columns may contain interactions, e.g., `var1 * var2 ~ var3`.

- statistic
A formula of the form `1 ~ 1 + N + Percent("row")`. The left-hand side may only be empty or contain a `1` to include row totals. The right-hand side may contain: `1` for column totals, `N` for counts, `Percent()` for cell percentages, `Percent("row")` for row percentages, `Percent("col")` for column percentages.

- data
A data.frame (or tibble)

- output
filename or object type (character string)

Supported filename extensions: .docx, .html, .tex, .md, .txt, .csv, .xlsx, .png, .jpg

Supported object types: "default", "html", "markdown", "latex", "latex_tabular", "data.frame", "gt", "kableExtra", "huxtable", "flextable", "DT", "jupyter". The "modelsummary_list" value produces a lightweight object which can be saved and fed back to the `modelsummary` function.

The "default" output format can be set to "kableExtra", "gt", "flextable", "huxtable", "DT", or "markdown". If the user does not choose a default value, the packages listed above are tried in sequence.

Session-specific configuration: `options("modelsummary_factory_default" = "gt")`

Persistent configuration: `modelsummary_config(output = "markdown")`

Warning: Users should not supply a file name to the `output` argument if they intend to customize the table with external packages. See the 'Details' section.

LaTeX compilation requires the `booktabs` and `siunitx` packages, but `siunitx` can be disabled or replaced with global options. See the 'Details' section.

- fmt
how to format numeric values: integer, user-supplied function, or `modelsummary` function.

Integer: Number of decimal digits

User-supplied functions: Any function which accepts a numeric vector and returns a character vector of the same length.

`modelsummary` functions:

`fmt = fmt_significant(2)`: Two significant digits (at the term-level)

`fmt = fmt_sprintf("%.3f")`: See `?sprintf`

`fmt = fmt_identity()`: unformatted raw values

- title
string

- notes
list or vector of notes to append to the bottom of the table.

- align
A string with a number of characters equal to the number of columns in the table (e.g., `align = "lcc"`). Valid characters: l, c, r, d.

"l": left-aligned column

"c": centered column

"r": right-aligned column

"d": dot-aligned column. For LaTeX/PDF output, this option requires at least version 3.0.25 of the siunitx LaTeX package. These commands must appear in the LaTeX preamble (they are added automatically when compiling Rmarkdown documents to PDF):

`\usepackage{booktabs}`

`\usepackage{siunitx}`

`\newcolumntype{d}{S[ input-open-uncertainty=, input-close-uncertainty=, parse-numbers = false, table-align-text-pre=false, table-align-text-post=false ]}`

- add_columns
a data.frame (or tibble) with the same number of rows as your main table.

- add_rows
a data.frame (or tibble) with the same number of columns as your main table. By default, rows are appended to the bottom of the table. You can define a "position" attribute of integers to set the row positions. See Examples section below.

- sparse_header
TRUE or FALSE. TRUE eliminates column headers which have a unique label across all columns, except for the row immediately above the data. FALSE keeps all headers. The order in which terms are entered in the formula determines the order in which headers appear. For example, `x~mean*z` will print the `mean`-related header above the `z`-related header.

- escape
boolean. TRUE escapes or substitutes LaTeX/HTML characters which could prevent the file from compiling/displaying. This setting does not affect captions or notes.

- ...
all other arguments are passed through to the table-making functions kableExtra::kbl, gt::gt, DT::datatable, etc. depending on the `output` argument. This allows users to pass arguments directly to `datasummary` in order to affect the behavior of other functions behind the scenes.
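
As described under `fmt`, a user-supplied formatter is any function that takes a numeric vector and returns a character vector of the same length. A minimal sketch in base R follows; `fmt_one_decimal` is a hypothetical name chosen for illustration and is not part of `modelsummary`.

```r
# Hypothetical formatter (not part of modelsummary): one decimal place,
# with a comma as the thousands separator.
fmt_one_decimal <- function(x) {
  formatC(x, format = "f", digits = 1, big.mark = ",")
}

# The result is a character vector with one element per input value.
formatted <- fmt_one_decimal(c(12.345, 1234.5))
print(formatted)
```

Such a function could then be supplied as `fmt = fmt_one_decimal` in a call to `datasummary_crosstab`.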

## Details

`datasummary_crosstab` is a wrapper around the `datasummary` function. This
wrapper works by creating a customized formula and by feeding it to
`datasummary`. The customized formula comes in two parts.

First, we take a two-sided formula supplied by the `formula` argument.
All variables of that formula are wrapped in a `Factor()` call to ensure
that the variables are treated as categorical.

Second, the `statistic` argument gives a two-sided formula which specifies
the statistics to include in the table. `datasummary_crosstab` modifies
this formula automatically to include "clean" labels.

Finally, the `formula` and `statistic` formulas are combined into a single
formula which is fed directly to the `datasummary` function to produce the
table.

Variables in `formula` are automatically wrapped in `Factor()`.
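
The `Factor()` wrapping step can be illustrated with a short base R sketch. This is not the package's internal code: `wrap_in_factor` is a hypothetical helper written for illustration, and it assumes a two-sided formula in which every symbol is a variable name.

```r
# Hypothetical helper (not part of modelsummary): wrap every variable in a
# two-sided formula in a Factor() call, leaving operators and constants alone.
wrap_in_factor <- function(f) {
  wrap <- function(e) {
    if (is.name(e)) {
      # a bare variable name such as cyl becomes Factor(cyl)
      return(call("Factor", e))
    }
    if (is.call(e)) {
      # keep the operator (e.g. `*`) and recurse into its arguments
      args <- lapply(as.list(e)[-1], wrap)
      return(as.call(c(list(e[[1]]), args)))
    }
    e  # constants such as 1 are returned unchanged
  }
  f[[2]] <- wrap(f[[2]])  # left-hand side (rows)
  f[[3]] <- wrap(f[[3]])  # right-hand side (columns)
  f
}

wrapped <- wrap_in_factor(am * cyl ~ gear)
print(wrapped)
```

For `am * cyl ~ gear`, the sketch yields `Factor(am) * Factor(cyl) ~ Factor(gear)`. The labelling of statistics and the final combination of the two formulas are handled inside the package and are not reproduced here.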

## Global Options

The behavior of `modelsummary` can be modified by setting global options. For example:

`options(modelsummary_model_labels = "roman")`

The rest of this section describes each of the options above.

### Model labels: default column names

These global options change the style of the default column headers:

`options(modelsummary_model_labels = "roman")`

`options(modelsummary_panel_labels = "roman")`

The supported styles are: "model", "panel", "arabic", "letters", "roman", "(arabic)", "(letters)", "(roman)".

The panel-specific option is only used when `shape="rbind"`.

### Table-making packages

`modelsummary` supports 5 table-making packages: `kableExtra`, `gt`,
`flextable`, `huxtable`, and `DT`. Some of these packages have overlapping
functionalities. For example, 3 of those packages can export to LaTeX. To
change the default backend used for a specific file format, you can use
the `options` function:

`options(modelsummary_factory_html = 'kableExtra')`

`options(modelsummary_factory_latex = 'gt')`

`options(modelsummary_factory_word = 'huxtable')`

`options(modelsummary_factory_png = 'gt')`
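
Because these settings are ordinary R options, they can be changed temporarily and then restored. The sketch below uses only base R and the option names listed above; it relies on `options()` returning the previous values when new ones are set.

```r
# options() returns the previous values of the options it changes,
# so they can be restored once the tables have been produced.
old <- options(
  modelsummary_factory_html = "kableExtra",
  modelsummary_factory_latex = "gt"
)

# Value in effect while tables are being built
current_html <- getOption("modelsummary_factory_html")
print(current_html)

# ... calls to datasummary_crosstab() would go here ...

# Restore whatever was set before (unset options are removed again)
options(old)
```

Inside a function, `on.exit(options(old))` achieves the same restoration even if an error occurs.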

### Table themes

Change the look of tables in an automated and replicable way, using the `modelsummary` theming functionality. See the vignette: https://vincentarelbundock.github.io/modelsummary/articles/appearance.html

`modelsummary_theme_gt`

`modelsummary_theme_kableExtra`

`modelsummary_theme_huxtable`

`modelsummary_theme_flextable`

`modelsummary_theme_dataframe`

### Model extraction functions

`modelsummary` can use two sets of packages to extract information from
statistical models: the `easystats` family (`performance` and `parameters`)
and `broom`. By default, it uses `easystats` first and then falls back on
`broom` in case of failure. You can change the order of priorities or include
goodness-of-fit extracted by *both* packages by setting:

`options(modelsummary_get = "broom")`

`options(modelsummary_get = "easystats")`

`options(modelsummary_get = "all")`

### Formatting numeric entries

By default, LaTeX tables enclose all numeric entries in the `\num{}` command
from the siunitx package. To prevent this behavior, or to enclose numbers
in dollar signs (for LaTeX math mode), users can call:

`options(modelsummary_format_numeric_latex = "plain")`

`options(modelsummary_format_numeric_latex = "mathmode")`

A similar option can be used to display numerical entries using MathJax in HTML tables:

`options(modelsummary_format_numeric_html = "mathjax")`

## Examples

```
library(modelsummary)
# crosstab of two variables, showing counts, row percentages, and row/column totals
datasummary_crosstab(cyl ~ gear, data = mtcars)
# crosstab of two variables, showing counts only and no totals
datasummary_crosstab(cyl ~ gear, statistic = ~ N, data = mtcars)
# crosstab of three variables
datasummary_crosstab(am * cyl ~ gear, data = mtcars)
# crosstab with two variables and column percentages
datasummary_crosstab(am ~ gear, statistic = ~ Percent("col"), data = mtcars)
```

## References

Arel-Bundock V (2022). “modelsummary: Data and Model Summaries in R.” *Journal of Statistical Software*, *103*(1), 1-23. doi:10.18637/jss.v103.i01.