Format columns of a data frame

Description

This function formats the columns of a data frame based on the column type (logical, date, numeric). It supports formatting options such as significant digits, decimal places, and scientific notation, as well as custom formatting for date and boolean values. If this function is applied several times to the same cell, only the last transformation is retained and the previous calls are ignored. The exception is the escape argument, which can be applied to previously transformed data.

Usage

format_tt(
  x,
  i = NULL,
  j = NULL,
  digits = getOption("tinytable_format_digits", default = NULL),
  num_fmt = getOption("tinytable_format_num_fmt", default = "significant"),
  num_zero = getOption("tinytable_format_num_zero", default = FALSE),
  num_suffix = getOption("tinytable_format_num_suffix", default = FALSE),
  num_mark_big = getOption("tinytable_format_num_mark_big", default = ""),
  num_mark_dec = getOption("tinytable_format_num_mark_dec", default = getOption("OutDec",
    default = ".")),
  date = getOption("tinytable_format_date", default = "%Y-%m-%d"),
  bool = getOption("tinytable_format_bool", default = function(column)
    tools::toTitleCase(tolower(column))),
  other = getOption("tinytable_format_other", default = as.character),
  replace = getOption("tinytable_format_replace", default = TRUE),
  escape = getOption("tinytable_format_escape", default = FALSE),
  markdown = getOption("tinytable_format_markdown", default = FALSE),
  quarto = getOption("tinytable_format_quarto", default = FALSE),
  fn = getOption("tinytable_format_fn", default = NULL),
  sprintf = getOption("tinytable_format_sprintf", default = NULL),
  ...
)
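Most defaults in the signature above are read with getOption(), so they can be changed once per session instead of in every call. The sketch below uses only base R to show that lookup pattern; the helper default_digits() is hypothetical and exists only for illustration. It assumes nothing about tinytable beyond the option name shown in the signature.

```r
# Hypothetical helper: look up the default the same way the signature does.
default_digits <- function() {
  getOption("tinytable_format_digits", default = NULL)
}

# Start from a clean state and remember the previous value.
old <- options(tinytable_format_digits = NULL)
before <- default_digits()   # NULL, because the option is not set

# Set a session-wide default.
options(tinytable_format_digits = 3)
after <- default_digits()    # 3

# Restore the previous state.
options(old)
```

An argument passed explicitly in a call takes precedence, because the getOption() default is only evaluated when the argument is not supplied.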

Arguments

x A data frame or a vector to be formatted.
i Row indices where the formatting should be applied.
j

Column indices where the formatting should be applied. Can be:

  • Integer vectors indicating column positions.

  • Character vector indicating column names.

  • A single string specifying a Perl-style regular expression used to match column names.
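The three selector forms listed above can be illustrated with base R on the column names of the built-in mtcars data. This is only a sketch of the idea: how tinytable resolves j internally is not shown on this page, and the variable names are illustrative.

```r
cols <- names(mtcars)

# Integer vector: column positions.
by_position <- cols[c(1, 3)]

# Character vector: column names.
by_name <- intersect(cols, c("mpg", "disp"))

# Single string: Perl-style regular expression matched against column names.
by_regex <- grep("^d", cols, perl = TRUE, value = TRUE)

print(by_position)  # "mpg"  "disp"
print(by_regex)     # "disp" "drat"
```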

digits Number of significant digits or decimal places.
num_fmt The format for numeric values; one of ‘significant’, ‘significant_cell’, ‘decimal’, or ‘scientific’.
num_zero Logical; if TRUE, trailing zeros are kept in "decimal" format (but not in "significant" format).
num_suffix Logical; if TRUE display short numbers with digits significant digits and K (thousands), M (millions), B (billions), or T (trillions) suffixes.
num_mark_big Character to use as a thousands separator.
num_mark_dec Decimal mark character. Default is the global option ‘OutDec’.
date A string passed to the format() function, such as "%Y-%m-%d". See the "Details" section in ?strptime.
bool A function to format logical columns. Defaults to title case.
other A function to format columns of other types. Defaults to as.character().
replace

Logical, String or Named list of vectors

  • TRUE: Replace NA by an empty string.

  • FALSE: Print NA as the string "NA".

  • String: Replace NA entries by the user-supplied string.

  • Named list: Replace matching elements of the vectors in the list by their names. Example:

    • list("-" = c(NA, NaN), "Tiny" = -Inf, "Massive" = Inf)
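The named-list form of replace can be pictured as a lookup from values to labels. The base R sketch below mimics that behavior on a plain numeric vector. The helper replace_by_name() is hypothetical, not a tinytable function, and the package's own implementation may differ.

```r
# Hypothetical helper: replace values found in each list element by that
# element's name, and convert everything else with as.character().
replace_by_name <- function(x, map) {
  out <- as.character(x)
  for (label in names(map)) {
    hit <- x %in% map[[label]]
    out[hit] <- label
  }
  out
}

x <- c(1.5, NA, NaN, -Inf, Inf)
res <- replace_by_name(
  x,
  list("-" = c(NA, NaN), "Tiny" = -Inf, "Massive" = Inf)
)
print(res)  # "1.5" "-" "-" "Tiny" "Massive"
```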

escape

Logical or "latex" or "html". If TRUE, escape special characters to display them as text in the format of the output of a tt() table.

  • If i and j are both NULL, escape all cells, column names, caption, notes, and spanning labels created by group_tt().

markdown Logical; if TRUE, render markdown syntax in cells. Ex: italicized text is properly italicized in HTML and LaTeX.
quarto Logical. Enable Quarto data processing and wrap cell content in a data-qmd span (HTML) or macro (LaTeX). See warnings in the Global Options section below.
fn Function for custom formatting. Accepts a vector and returns a character vector of the same length.
sprintf String passed to the ?sprintf function to format numbers or interpolate strings with a user-defined pattern (similar to the glue package, but using Base R).
... Additional arguments are ignored.
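The numeric arguments above (digits, num_fmt, num_mark_big, num_mark_dec) and the sprintf argument correspond to familiar base R operations. The sketch below shows those operations directly with signif(), formatC(), and sprintf(). It illustrates the concepts only and is not a description of how format_tt() is implemented.

```r
x <- c(0.000123456789, 12.4356789, 1234567.891)

# Two significant digits for each value.
sig <- signif(x, 2)

# Two decimal places, trailing zeros kept.
dec <- formatC(x, digits = 2, format = "f")

# Scientific notation with two digits after the decimal mark.
sci <- formatC(x, digits = 2, format = "e")

# Thousands separator and custom decimal mark.
big <- formatC(1234567.891, digits = 0, format = "f", big.mark = " ")
dec_mark <- formatC(12.4356789, digits = 2, format = "f", decimal.mark = ",")

# String interpolation with a sprintf pattern.
label <- sprintf("Food: %s", "Tofu")

print(dec)       # "0.00" "12.44" "1234567.89"
print(big)       # "1 234 568"
print(dec_mark)  # "12,44"
print(label)     # "Food: Tofu"
```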

Value

A data frame with formatted columns.

Global options

Row names

When the x data frame includes row names, tinytable can bind them to the first column (without an empty string as column name). This global option triggers this behavior:

options(tinytable_tt_rownames = TRUE)

x <- mtcars[1:3, 1:3]
tt(x)

options(tinytable_tt_rownames = FALSE)

Quarto data processing

The format_tt(quarto=TRUE) argument activates Quarto data processing for specific cells. This functionality comes with a few warnings:

  1. Currently, Quarto provides a LaTeX macro, but it does not appear to do anything with it. References and markdown codes may not be processed as expected in LaTeX.

  2. Quarto data processing can conflict with tinytable styling or formatting options. See below for how to disable it.

options(tinytable_quarto_disable_processing = TRUE)

Disable Quarto processing of cell content. Setting this global option to FALSE may lead to conflicts with some tinytable features, but it also allows use of markdown and Quarto-specific code in table cells, such as cross-references.

x <- data.frame(Math = "x^2^", Citation = "@Lovelace1842")
fn <- function(z) sprintf("<span data-qmd='%s'></span>", z)
tt(x) |> format_tt(i = 1, fn = fn)

See this link for more details: https://quarto.org/docs/authoring/tables.html#disabling-quarto-table-processing

HTML

  • EXPERIMENTAL options(tinytable_html_mathjax = TRUE) inserts MathJax scripts in the HTML document. Warning: This may conflict with other elements of the page if MathJax is otherwise loaded.

PDF

  • options(tinytable_save_pdf_clean = TRUE) deletes temporary and log files.

  • options(tinytable_save_pdf_engine = "xelatex") selects the LaTeX engine: one of "xelatex", "pdflatex", or "lualatex".
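Global options like the two PDF settings above persist for the rest of the session, so it can be convenient to set them temporarily and restore the previous values afterwards. The following is a minimal base R sketch of that pattern. The choice of "lualatex" is arbitrary, and the place where PDF files would be written is left as a comment.

```r
# Set both PDF options and keep the previous values for later.
old <- options(
  tinytable_save_pdf_engine = "lualatex",
  tinytable_save_pdf_clean = TRUE
)

engine <- getOption("tinytable_save_pdf_engine")
clean <- getOption("tinytable_save_pdf_clean")

# ... code that writes PDF tables would go here ...

# Restore whatever was set before.
options(old)
```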

Examples

library("tinytable")

# Significant digits with custom decimal and thousands marks
dat <- data.frame(
  a = rnorm(3, mean = 10000),
  b = rnorm(3, 10000))
tab <- tt(dat)
format_tt(tab,
  digits = 2,
  num_mark_dec = ",",
  num_mark_big = " ")

  a       b
  10 001  10 000
  9 999   10 000
  10 001  10 000

# num_fmt = "significant_cell"
k <- tt(data.frame(x = c(0.000123456789, 12.4356789)))
format_tt(k, digits = 2, num_fmt = "significant_cell")

  x
  0.00012
  12

# sprintf pattern, fixed decimals with trailing zeros, and K/M/B/T suffixes
dat <- data.frame(
  a = c("Burger", "Halloumi", "Tofu", "Beans"),
  b = c(1.43202, 201.399, 0.146188, 0.0031),
  c = c(98938272783457, 7288839482, 29111727, 93945))
tt(dat) |>
  format_tt(j = "a", sprintf = "Food: %s") |>
  format_tt(j = 2, digits = 1, num_fmt = "decimal", num_zero = TRUE) |>
  format_tt(j = "c", digits = 2, num_suffix = TRUE)

  a               b      c
  Food: Burger    1.4    99T
  Food: Halloumi  201.4  7.3B
  Food: Tofu      0.1    29M
  Food: Beans     0.0    94K

# Thousands separator
y <- tt(data.frame(x = c(123456789.678, 12435.6789)))
format_tt(y, digits = 3, num_mark_big = " ")

  x
  123 456 790
  12 436

# Render markdown syntax in cells
x <- tt(data.frame(Text = c("_italicized text_", "__bold text__")))
format_tt(x, markdown = TRUE)

  Text
  italicized text
  bold text

# Replace NA entries by a user-supplied string
tab <- data.frame(a = c(NA, 1, 2), b = c(3, NA, 5))
tt(tab) |> format_tt(replace = "-")

  a  b
  -  3
  1  -
  2  5

# Escape special characters so they display as text
dat <- data.frame(
  "LaTeX" = c("Dollars $", "Percent %", "Underscore _"),
  "HTML" = c("<br>", "<sup>4</sup>", "<emph>blah</emph>")
)
tt(dat) |> format_tt(escape = TRUE)

  LaTeX         HTML
  Dollars $     <br>
  Percent %     <sup>4</sup>
  Underscore _  <emph>blah</emph>