---
title: "Summarytools in R Markdown Documents"
author: "Dominic Comtois"
date: "`r Sys.Date()`"
output:
  html_document:
    fig_caption: false
    toc: true
    toc_depth: 1
    css: assets/vignette.css
    df_print: default
vignette: >
  %\VignetteIndexEntry{Summarytools in R Markdown Documents}
  %\VignetteEncoding{UTF-8}
  %\VignetteDepends{kableExtra}
  %\VignetteDepends{magrittr}
  %\VignetteEngine{knitr::rmarkdown}
---
```{r, echo=FALSE, results='asis'}
summarytools::st_css(main = TRUE, global = TRUE)
```
```{r setup, include=FALSE}
library(knitr)
opts_chunk$set(comment = NA,
               prompt  = FALSE,
               cache   = FALSE,
               echo    = TRUE,
               results = 'asis')
library(summarytools)
st_options(bootstrap.css     = FALSE,       # Already part of the theme so no need for it
           plain.ascii       = FALSE,       # One of the essential settings
           style             = "rmarkdown", # Idem.
           dfSummary.silent  = TRUE,        # Suppresses messages about temporary files
           footnote          = NA,          # Keeping the results minimalist
           descr.silent      = TRUE,        # To avoid messages when building / checking
           subtitle.emphasis = FALSE)       # For the vignette theme, this gives better results.
                                            # For other themes, using TRUE might be preferable.
```
# 1. Introduction
This document shows, mainly through examples, how best to use
**summarytools** in *R Markdown* documents. For a more in-depth view
of the package's features, please see `vignette("introduction", "summarytools")`;
the online version is available
[on CRAN](https://cran.r-project.org/package=summarytools/vignettes/introduction.html).
## 1.1 Methods vs Styles
Every time we display **summarytools** objects with `print()`, `view()`,
or `stview()`, we pick -- explicitly or not -- one of several display methods.
The possible display methods are *pander*, *render*, *viewer*, and *browser*.
The method is set with the `method` parameter of `print.summarytools()` and
`view()` (alias: `stview()`).
Since methods *viewer* and *browser* are mostly
meant for interactive work and rely on the same underlying code as *render*,
we will assume for the purpose of this document that there are really only two
methods: *pander* and *render*.
### Only the *pander* Method Uses Styles
The *pander* method is used by default when results are automatically printed
to the console, or when we use `print()` without an explicit `method`
argument.
The *style* parameter is passed on to **pander** (see `?pander::pander`
or visit its [GitHub page](https://github.com/Rapporter/pander) to learn more
about this very useful package).
When we use any of the *viewer*, *browser*, or *render* methods, the
package uses **htmltools** to generate results; any specified *styles*
are thus ignored.
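As a rough illustration of the point above, the sketch below prints the same `freq()` object with each of the two methods. The object name `gender_freq` is only illustrative, and `style = "grid"` is set on purpose to show that the style matters for the *pander* method only.

```{r, eval=FALSE}
library(summarytools)

# Build the frequency table once; the requested style is stored with the object
gender_freq <- freq(tobacco$gender, plain.ascii = FALSE, style = "grid")

# pander method: the table is written out as text, using the "grid" style
print(gender_freq, method = "pander")

# render method: html is generated with htmltools, so the style is not used
print(gender_freq, method = "render")
```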
### **summarytools** styles are **pander** styles
Available styles are the ones supported by **pander**:
- simple (default, used mainly in R console)
- rmarkdown (used by all core functions except `dfSummary()`)
- grid (mainly used with `dfSummary()`)
- multiline (can be used with `dfSummary()` if you use *ascii* graphs only)
- jira (more recent addition, not thoroughly tested)
## 1.2 General Guidelines
**Always** set **results='asis'** either explicitly on a chunk-by-chunk basis
or by including `opts_chunk$set(results = 'asis')` in your setup chunk.
Also, don't forget to specify **`plain.ascii = FALSE`** in all function calls
using the *pander* method. It is advisable to set this option, as well as the
`style` option, in the setup chunk:
```{r, eval=FALSE}
st_options(plain.ascii = FALSE, style = "rmarkdown")
```
If you get repeated, unhelpful warnings, use chunk options `message = FALSE`
and/or `warning = FALSE`. Another option is to pass `silent = TRUE` to the
`print()` method or to the `view()` / `stview()` functions. See `?st_options`
to set this globally for individual functions.
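Both approaches are sketched below, under the assumption that the `tobacco` data frame shipped with the package is used. The global option names are the ones already used in this vignette's own setup chunk.

```{r, eval=FALSE}
library(summarytools)

# Per call: ask print() to hide console messages for this object only
print(descr(tobacco), method = "render", silent = TRUE)

# Globally: silence descr() and dfSummary() for the rest of the document
st_options(descr.silent = TRUE, dfSummary.silent = TRUE)
```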
The following table indicates which method / style is better suited for each
**summarytools** function in the context of R Markdown documents:
| Function | render method | pander method | pander style |
|:------------|:-------------:|:-------------:|:-------------|
| freq() | ✓ | ✓ | rmarkdown |
| ctable() | ✓ | Sub-optimal | rmarkdown |
| descr() | ✓ | ✓ | rmarkdown |
| dfSummary() | ✓ | ✓ | grid |
**Recommended Style When Using *pander* method**
For `freq()`, `descr()`, and `ctable()`, _rmarkdown_ style is recommended.
For `dfSummary()`, _grid_ is recommended. Note that _multiline_ can also
be used, but only _ascii_ graphs will be displayed.
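These recommendations can be set once for the whole document. The following is a minimal sketch for a setup chunk; it relies on the `dfSummary.style` option, which lets `dfSummary()` use a different default style from the other functions.

```{r, eval=FALSE}
library(summarytools)

st_options(plain.ascii     = FALSE,       # required for markdown output
           style           = "rmarkdown", # default for freq(), descr(), ctable()
           dfSummary.style = "grid")      # default for dfSummary() only
```

With these defaults in place, the `style` and `plain.ascii` arguments can be left out of individual function calls.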
Starting with `freq()`, we'll now review the recommended methods and styles to
get satisfying results in *R Markdown* documents.
--------------------------------------------------------------------------------
# 2. Using freq() in R Markdown
`freq()` is best used with method "pander" (default), `style = "rmarkdown"`;
*html* rendering is also possible.
## 2.1 Pander Style for freq()
With `method = "pander"`, `style = "rmarkdown"` is the easy winner. Since
"pander" is the default method, you can usually omit the call to
`print()`. But to make things as clear as possible, we'll include it here.
```{r}
print(freq(tobacco$gender,
           plain.ascii = FALSE,
           style       = "rmarkdown"),
      method = "pander")
```
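For comparison, here is the shorter form without `print()`. This sketch assumes the chunk uses `results = 'asis'`, as recommended in the general guidelines; the object is then printed automatically with the default *pander* method.

```{r, eval=FALSE}
library(summarytools)

# No explicit print(): the default "pander" method is used when the
# result is printed automatically at the end of the chunk
freq(tobacco$gender, plain.ascii = FALSE, style = "rmarkdown")
```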
## 2.2 HTML Rendering for freq()
There are rarely any problems when using the *render* method to display
`freq()` results.
```{r}
print(freq(tobacco$gender), method = "render")
```
If you find the table is too large, you can use `table.classes = "st-small"`:
```{r, message=FALSE}
print(freq(tobacco$gender), method = "render", table.classes = "st-small")
```
--------------------------------------------------------------------------------
# 3. Using ctable() in R Markdown
## 3.1 Rmarkdown Style for ctable()
Tables with multi-row headings are not fully supported in *markdown*
(yet), but the result is close to acceptable. This, however, is not
true for all themes, which is why the *render* method is preferred.
```{r}
ctable(tobacco$gender,
       tobacco$smoker,
       plain.ascii = FALSE,
       style       = "rmarkdown")
```
## 3.2 HTML Rendering for ctable()
For best results, use this method.
```{r ctable_html}
print(ctable(tobacco$gender, tobacco$smoker), method = "render")
```
--------------------------------------------------------------------------------
# 4. Using descr() in R Markdown
`descr()` gives good results with both `style = "rmarkdown"` and *html*
rendering.
## 4.1 Rmarkdown Style for descr()
```{r}
descr(tobacco, plain.ascii = FALSE, style = "rmarkdown")
```
## 4.2 HTML Rendering for descr()
We'll use `table.classes = "st-small"` to show how it affects the table's size,
compared to the `freq()` table rendered earlier.
We'll also use `message = FALSE` as chunk option to avoid the message saying
that non-numerical variables have been ignored.
```{r, message=FALSE}
print(descr(tobacco), method = "render", table.classes = "st-small")
```
--------------------------------------------------------------------------------
# 5. Using dfSummary() in R Markdown
To get optimal results, whichever method you choose, it is best to omit at
least one, and if possible two, columns from the output. Also, choose the
value of the `graph.magnif` parameter carefully.
## 5.1 Grid Style for dfSummary()
Don't forget to specify `plain.ascii = FALSE` (or set it as a global
option with `st_options(plain.ascii = FALSE)`), or you won't get good results.
(Note: The following output is an image (screenshot). This is because CRAN
doesn't allow writing to "/tmp" or to any directory other than R's temp
directory, and using the latter would pose problems in terms of column widths.
The introductory vignette explains this issue in more detail.)
```{r dfs_grid, eval=FALSE}
dfSummary(tobacco,
          plain.ascii  = FALSE,
          style        = "grid",
          graph.magnif = 0.75,
          varnumbers   = FALSE,
          valid.col    = FALSE,
          tmp.img.dir  = "/tmp")
```
## 5.2 HTML Rendering for dfSummary()
This method works really well, and not having to specify the `tmp.img.dir`
parameter is a plus.
```{r}
print(dfSummary(tobacco,
                varnumbers   = FALSE,
                valid.col    = FALSE,
                graph.magnif = 0.75),
      method = "render")
```
## 5.3 Managing Lengthy dfSummary() Outputs in R Markdown Documents
For data frames containing numerous variables, we can use the `max.tbl.height`
argument to wrap the results in a scrollable window having the specified
height, in pixels.
```{r}
print(dfSummary(tobacco,
                varnumbers   = FALSE,
                valid.col    = FALSE,
                graph.magnif = 0.75),
      max.tbl.height = 300,
      method = "render")
```
Some users reported getting repeated X11 warnings; those can easily be
avoided by using the following chunk header:
`{r, results="asis", warning=FALSE}`.
--------------------------------------------------------------------------------
# 6. Using Other Formatting Packages
As explained in the introductory vignette, `tb()` can be used to convert
**summarytools** objects created with `freq()` and `descr()` to simple
*tibbles*, which packages specialized in table formatting will be able
to process. This is particularly helpful with `stby` objects:
```{r}
library(kableExtra)
library(magrittr)
stby(iris, iris$Species, descr, stats = "fivenum") %>%
  tb() %>%
  kable(format = "html", digits = 2) %>%
  collapse_rows(columns = 1, valign = "top")
```
Using `tb(order = 3)` flips the order of the grouping variable(s) and the
reported variable(s):
```{r}
stby(iris, iris$Species, descr, stats = "fivenum") %>%
  tb(order = 3) %>%
  kable(format = "html", digits = 2) %>%
  collapse_rows(columns = 1, valign = "top")
```
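`tb()` works the same way on `freq()` objects. The following is a minimal sketch that uses only `knitr::kable()` for formatting; the name `gender_tbl` is illustrative.

```{r, eval=FALSE}
library(summarytools)
library(knitr)

# Convert a frequency table to a tibble
gender_tbl <- tb(freq(tobacco$gender))

# The tibble can then be handed to any table-formatting function
kable(gender_tbl, format = "html", digits = 1)
```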
--------------------------------------------------------------------------------
# 7. Including dfSummaries in PDF Documents
Here is a recipe for including fully formatted data frame summaries in *pdf*
documents. There is some work involved, but carefully following the instructions
given here should give the expected results.
There are basically two parts to this: first, you must create a preamble *tex*
file. Second, you must indicate in the *YAML* section of your document where to
find this file.
### Included Preamble *Tex* File
This is the $\LaTeX$ content that needs to be included as a preamble. You can
either copy it into your own *tex* file or use the file that is included in
**summarytools** (as of version 1.0), following the instructions provided
below.
````
\usepackage{graphicx}
\usepackage[export]{adjustbox}
\usepackage{letltxmacro}
\LetLtxMacro{\OldIncludegraphics}{\includegraphics}
\renewcommand{\includegraphics}[2][]{\raisebox{0.5\height}%
{\OldIncludegraphics[valign=t,#1]{#2}}}
````
If you choose to create a *tex* file from the above content, its name and
location are up to you. I suggest you put it in the same directory as your
*Rmd* file.
Along with the `graph.magnif` parameter for `dfSummary()`, you might need to
adjust the `0.5` value used as `raisebox` parameter in the preamble.
### The YAML Section
Your document should start with a *YAML* header like this one:
```
---
title: "My PDF With Data Frame Summaries"
output:
  pdf_document:
    latex_engine: xelatex
    includes:
      in_header:
        - !expr system.file("includes/fig-valign.tex", package = "summarytools")
---
```
If you need to customize the content of the preamble, then your header will
look something like this (assuming the customized *tex* file is in the same
directory as your *Rmd* document):
```
---
title: "My PDF With Data Frame Summaries"
output:
  pdf_document:
    latex_engine: xelatex
    includes:
      in_header: fig-valign-modified.tex
---
```
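One way to get an editable starting point is to copy the bundled preamble file next to your *Rmd* document. The sketch below assumes that the working directory is the document's directory. The target name `fig-valign-modified.tex` simply matches the header above, and the variable names are illustrative.

```{r, eval=FALSE}
# Locate the preamble file shipped with summarytools
preamble_src <- system.file("includes/fig-valign.tex", package = "summarytools")

# Copy it to the current directory under a new name, without overwriting
# an existing (possibly already customized) file
preamble_dest <- "fig-valign-modified.tex"
if (nzchar(preamble_src) && !file.exists(preamble_dest)) {
  file.copy(preamble_src, preamble_dest)
}
```

The copy can then be edited freely, for instance to adjust the `raisebox` value.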
The *xelatex* engine option is not mandatory, but there are several
advantages to it. I use it systematically and recommend you do the same.
### R Code
Here is an example setup chunk:
````
```{r, message=FALSE}`r ''`
library(summarytools)
st_options(
  plain.ascii            = FALSE,
  style                  = "rmarkdown",
  dfSummary.style        = "grid",
  dfSummary.valid.col    = FALSE,
  dfSummary.graph.magnif = .52,
  subtitle.emphasis      = FALSE,
  tmp.img.dir            = "/tmp"
)
```
````
And here is a chunk actually creating the summary:
````
```{r, results='asis', message=FALSE}`r ''`
define_keywords(title.dfSummary = "Data Frame Summary in PDF Format")
dfSummary(tobacco)
```
````
### Remarks
Since we redefined the $\LaTeX$ command `\includegraphics`, all images included
using `![](some-image.png)` will be affected. In some cases, this could pose
a problem. Eventually, we hope to find a more robust solution, without
such side effects. (If you are well versed in $\LaTeX$ and think you can
solve this problem, please get in touch.)
--------------------------------------------------------------------------------
# 8. This Vignette's Setup
This vignette uses theme `rmarkdown::html_vignette`. Its *YAML* section
looks like this:
```
---
title: "Summarytools in R Markdown Documents"
author: "Dominic Comtois"
date: "`r Sys.Date()`"
output:
  html_document:
    fig_caption: false
    toc: true
    toc_depth: 1
    css: assets/vignette.css
vignette: >
  %\VignetteIndexEntry{Summarytools in R Markdown Documents}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
  %\VignetteDepends{magrittr}
  %\VignetteDepends{kableExtra}
---
```
The *vignette.css* file is copied from the installed **rmarkdown** package's
'templates/html_vignette/resources' directory.
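As a sketch, the copy can be made from R as shown below. The leading `rmarkdown` component of the path passed to `system.file()` is an assumption about how the installed package is laid out, so check that `css_src` is not an empty string before relying on it. The `assets` folder name follows the header above.

```{r, eval=FALSE}
# Locate vignette.css inside the installed rmarkdown package
# (assumed location; system.file() returns "" if the file is not found)
css_src <- system.file("rmarkdown", "templates", "html_vignette",
                       "resources", "vignette.css",
                       package = "rmarkdown")

# Copy it to the assets/ folder next to the Rmd file
if (nzchar(css_src)) {
  dir.create("assets", showWarnings = FALSE)
  file.copy(css_src, file.path("assets", "vignette.css"), overwrite = FALSE)
}
```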
### Global Options
The following **global options** for **knitr** and **summarytools** have been
set. Other options might also be useful to optimize content, but this is a
good place to start from.
````
```{r setup, include=FALSE}`r ''`
library(knitr)
opts_chunk$set(comment = NA,
               prompt  = FALSE,
               cache   = FALSE,
               echo    = TRUE,
               results = 'asis')
library(summarytools)
st_options(bootstrap.css     = FALSE,       # Already part of the theme
           plain.ascii       = FALSE,       # Essential setting for Rmd
           style             = "rmarkdown", # Essential setting for Rmd
           dfSummary.silent  = TRUE,        # Hides redundant messages
           footnote          = NA,          # Keeping the results minimal
           descr.silent      = TRUE,        # Avoids messages when building
           subtitle.emphasis = FALSE)       # For the vignette theme,
                                            # this gives better results.
                                            # For other themes, using
                                            # TRUE might be preferable.
```
````
Finally, **summarytools CSS** has been included in the following manner, before
the setup chunk:
````
```{r, echo=FALSE, results='asis'}`r ''`
summarytools::st_css(main = TRUE, global = TRUE)
```
````
--------------------------------------------------------------------------------
# 9. Final Notes
This is by no means a definitive guide; depending on the themes you use,
you could find that other settings yield better results. If you are looking
to create a _Word_ or a _PDF_ document, you might want to try different
combinations of options. If you find problems with the recommended settings, or
if you find better combinations, you are welcome to
[open an issue on GitHub](https://github.com/dcomtois/summarytools/issues)
to suggest modifications, or to make a pull request with your own improvements
to this vignette.