Datasaurus | R Documentation |

## The Datasaurus Dozen

### Description

An illustrative exercise in never trusting summary statistics without
also visualizing the underlying data.

### Usage

```
Datasaurus
```

### Format

A data frame with 1,846 observations on the following 3 variables.

`dataset`

the particular data set, one of 12

`x`

a random variable

`y`

another random variable

### Details

Data were created by Alberto Cairo to illustrate that you should always
visualize your data rather than rely on summary statistics alone. These are
12 data sets, in long form, each with a mean of `x` of about 54.26 and a
mean of `y` of about 47.83. The standard deviation of `x` is about 16.76 and
the standard deviation of `y` is about 26.93. Within each data set, `x` and
`y` are weakly correlated, at about -0.06.
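
The near-identical statistics can be checked directly. The following is a minimal base R sketch, not part of the original documentation: `summarize_by_dataset()` is a hypothetical helper name, and the code assumes the package that ships `Datasaurus` is already attached so that the data frame is visible in the session. It computes the per-data-set means, standard deviations, and correlation, then draws one scatterplot per data set.

```r
# Hypothetical helper (not from the package): per-group summary statistics
# for a long-form data frame with columns `dataset`, `x`, and `y`.
summarize_by_dataset <- function(df) {
  # drop = TRUE skips unused factor levels, which would yield empty groups
  groups <- split(df, df$dataset, drop = TRUE)
  rows <- lapply(names(groups), function(name) {
    g <- groups[[name]]
    data.frame(
      dataset = name,
      mean_x  = mean(g$x),
      mean_y  = mean(g$y),
      sd_x    = sd(g$x),
      sd_y    = sd(g$y),
      cor_xy  = cor(g$x, g$y),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Assumption: `Datasaurus` is available (its package is attached).
# The guard keeps the sketch runnable even when it is not.
if (exists("Datasaurus")) {
  # The summary rows should look nearly identical across data sets
  print(summarize_by_dataset(Datasaurus))

  # One scatterplot per data set shows how different the shapes are
  panel_names <- as.character(unique(Datasaurus$dataset))
  op <- par(mfrow = grDevices::n2mfrow(length(panel_names)),
            mar = c(2, 2, 2, 1))
  for (name in panel_names) {
    g <- Datasaurus[Datasaurus$dataset == name, ]
    plot(g$x, g$y, main = name, pch = 16, cex = 0.5)
  }
  par(op)
}
```

If the printed rows agree to roughly two decimal places while the panels look nothing alike, that is the point the data set is meant to make.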

### Author(s)

Alberto Cairo, Justin Matejka, George Fitzmaurice

### References

Cairo, Alberto. 2016. “Download the Datasaurus: Never trust
summary statistics alone; always visualize your data.”
*URL:* http://www.thefunctionalart.com/2016/08/download-datasaurus-never-trust-summary.html

Matejka, Justin and George Fitzmaurice. 2017. “Same Stats, Different Graphs: Generating Datasets
with Varied Appearance and Identical Statistics through Simulated Annealing.”
*ACM SIGCHI Conference on Human Factors in Computing Systems*.
*URL:* https://www.autodesk.com/research/publications/same-stats-different-graphs