---
title: "Why use difference-adjusted confidence intervals?"
bibliography: "../inst/REFERENCES.bib"
csl: "../inst/apa-6th.csl"
output: 
  rmarkdown::html_vignette
description: >
  This vignette explains the difference between a regular confidence interval
  and a difference-adjusted confidence interval.
vignette: >
  %\VignetteIndexEntry{Why use difference-adjusted confidence intervals?}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---


```{r, echo = FALSE, warning=FALSE, message = FALSE, results = 'hide'}
cat("this will be hidden; use for general initializations.")
library(superb)
library(ggplot2)
```

Most studies examine the effect of a certain factor on a measure. However, we rarely 
have a hypothesis stating exactly what the expected result should be. Instead, we have a 
control group to which the results of the treatment group will be compared [@c17].

## A paradoxical example

Imagine a study in a school examining the impact of playing collaborative games 
before classes begin. Such a study will most likely have two groups: one in which
the students play collaborative games and one in which the students have
unstructured activities prior to classes. The objective of the study is to
*compare* the two groups.

Consider the results obtained, shown in Figure 1. The measurement instrument tends to
return scores near 100.

```{r, message=FALSE, echo=FALSE, fig.height=4, fig.width=3, fig.cap="**Figure 1**. Mean scores along with 95% confidence interval for two groups of students on the quality of learning behavior."}
superb(
    score ~ grp, 
    dataFigure1, 
    plotStyle="line" ) + 
xlab("Group") + ylab("Score") + 
labs(title="(stand-alone)\n95% confidence intervals") +
coord_cartesian( ylim = c(85,115) ) +
theme_gray(base_size=10) +
scale_x_discrete(labels=c("1" = "Collaborative\ngames", "2" = "Unstructured\nactivity"))
```

As seen, the students playing collaborative games seem to obtain better scores. 
Taking the confidence intervals into account, the manipulation seems to significantly
improve the learning behavior, as the lower end of the *Collaborative games* interval lies above the *Unstructured activity* mean (and vice versa).

What a surprise to discover that a *t* test does **NOT** confirm this impression (*t*(48) = 1.76, *p* = .085):

```{r, message=FALSE, warning=FALSE, echo=TRUE}
t.test(dataFigure1$score[dataFigure1$grp == 1], 
       dataFigure1$score[dataFigure1$grp == 2],
       var.equal = TRUE)
```

## The origin of the paradox 

The reason is that the confidence intervals used are "stand-alone": they can be used to
compare, say, the first group's mean to the value 100. As this value is outside the interval,
we are correct in concluding that the first group's mean is significantly different
from 100 (at the $\alpha$ level of .05), as confirmed by a 
single-group *t* test:

```{r, message=FALSE, echo=TRUE}
t.test(dataFigure1$score[dataFigure1$grp==1], mu=100)
```

Likewise, the second group's mean is significantly different from 105 (which happens
to be the first group's mean):

```{r, message=FALSE, echo=TRUE}
t.test(dataFigure1$score[dataFigure1$grp==2], mu=105)
```

This is precisely the purpose of **stand-alone** confidence intervals: to compare a 
single result to a fixed value. The fixed value (here 100 for the first group 
and 105 for the second group) has no uncertainty; it is a constant. 

In contrast, the two-group *t* test compares two means, both of which are
uncertain. Therefore, when making a confidence interval, it is necessary that 
the basic, stand-alone, confidence interval be *informed* that it is going to 
be compared not to a fixed value but to a second quantity which is itself 
uncertain.

Using the language of analysis of variance, we can say that when the purpose of the plot
is to compare means to other means, there is more variance in the comparisons than there
is in the single groups taken in isolation.
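
To see where the adjustment of the next section comes from, recall that the standard
error of a difference between two independent means combines the two standard errors:

$$
SE_{\bar{X}_1 - \bar{X}_2} = \sqrt{SE_{\bar{X}_1}^2 + SE_{\bar{X}_2}^2} = \sqrt{2}\; SE \quad \text{when } SE_{\bar{X}_1} = SE_{\bar{X}_2} = SE,
$$

so that, under equal variances, the uncertainty relevant to a comparison is $\sqrt{2}$
times larger than the uncertainty of a single mean.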

## Adjusting the error bars

Assuming that the variances are roughly homogeneous between groups (an assumption
made by the *t* test, but see below), there is a simple adjustment that can 
be applied to the 
error bars: just increase their length by a factor of $\sqrt{2}$. As $\sqrt{2} \approx 1.41$, 
this amounts to lengthening them by about 41%.
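
As a quick check, this adjustment can be computed by hand from the usual *t*-based
confidence interval half-width. The sketch below does so for the first group of
``dataFigure1``, assuming the standard formula $t_{.975,\,n-1} \times s / \sqrt{n}$
for the stand-alone half-width:

```{r, message=FALSE, echo=TRUE}
# stand-alone 95% CI half-width for group 1, then its difference-adjusted version
g1 <- dataFigure1$score[dataFigure1$grp == 1]
n  <- length(g1)
halfwidth <- qt(0.975, df = n - 1) * sd(g1) / sqrt(n)   # stand-alone half-width
c(standalone = halfwidth, adjusted = sqrt(2) * halfwidth)
```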

With ``superb()``, this so-called *difference adjustment* [@b12] is obtained easily
by adding ``purpose = "difference"`` to the list of adjustments, that is, 
``adjustments = list(purpose = "difference")``, as seen below.

```{r, message=FALSE, echo=TRUE, fig.height=4, fig.width=3, fig.cap="**Figure 2**. Mean scores along with difference-adjusted 95% confidence interval for two groups of students on the quality of learning behavior."}
superb(
    score ~ grp, 
    dataFigure1, 
    adjustments=list(purpose = "difference"),  # the only new thing here
    plotStyle="line" ) + 
xlab("Group") + ylab("Score") + 
labs(title="Difference-adjusted\n95% confidence intervals") +
coord_cartesian( ylim = c(85,115) ) + 
theme_gray(base_size=10) +
scale_x_discrete(labels=c("1" = "Collaborative\ngames", "2" = "Unstructured\nactivity"))
```

This is where the usefulness of ``superb`` is apparent: a single option turns a 
plot showing means with stand-alone confidence
intervals into a plot showing means with difference-adjusted confidence intervals [@cgh21].

## Illustrating the impact of the adjustments

Just for comparison purposes, let's show both plots side by side.

```{r, message=FALSE, echo=TRUE, fig.height=4, fig.cap="**Figure 3**. Two representations of the data with unadjusted (left) and adjusted (right) 95% confidence intervals."}
library(gridExtra)
plt1 <- superb(
    score ~ grp, 
    dataFigure1, 
    plotStyle="line" ) + 
xlab("Group") + ylab("Score") + 
labs(title="(stand-alone)\n95% confidence intervals") +
coord_cartesian( ylim = c(85,115) ) +
theme_gray(base_size=10) +
scale_x_discrete(labels=c("1" = "Collaborative\ngames", "2" = "Unstructured\nactivity")) 

plt2 <- superb(
    score ~ grp, 
    dataFigure1, 
    adjustments=list(purpose = "difference"), 
    plotStyle="line" ) + 
xlab("Group") + ylab("Score") + 
labs(title="Difference-adjusted\n95% confidence intervals") +
coord_cartesian( ylim = c(85,115) ) + 
theme_gray(base_size=10) +
scale_x_discrete(labels=c("1" = "Collaborative\ngames", "2" = "Unstructured\nactivity")) 

plt <- grid.arrange(plt1, plt2, ncol=2)
```

A second way to compare the two plots is to superimpose them, as in Figure 4:

```{r, message=FALSE, echo=TRUE, fig.height=4, fig.width=6, fig.cap="**Figure 4**. Two representations of the results with adjusted and unadjusted error bars on the same plot"}
# generate the two plots, nudging the error bars, using distinct colors, and 
# having the second plot's background transparent (with ``makeTransparent()`` )
plt1 <- superb(
    score ~ grp,
    dataFigure1, 
    errorbarParams = list(color="blue",position = position_nudge(-0.05) ),
    plotStyle="line" ) + 
xlab("Group") + ylab("Score") + 
labs(title="(red) Difference-adjusted 95% confidence intervals\n(blue) (stand-alone) 95% confidence intervals") +
coord_cartesian( ylim = c(85,115) ) +
theme_gray(base_size=10) +
scale_x_discrete(labels=c("1" = "Collaborative\ngames", "2" = "Unstructured\nactivity")) 


plt2 <- superb(
    score ~ grp,
    dataFigure1, 
    adjustments=list(purpose = "difference"), 
    errorbarParams = list(color="red",position = position_nudge(0.05) ),
    plotStyle="line" ) + 
xlab("Group") + ylab("Score") + 
labs(title="(red) Difference-adjusted 95% confidence intervals\n(blue) (stand-alone) 95% confidence intervals") +
coord_cartesian( ylim = c(85,115) ) + 
theme_gray(base_size=10) +
scale_x_discrete(labels=c("1" = "Collaborative\ngames", "2" = "Unstructured\nactivity"))

# transform the ggplots into "grobs" so that they can be manipulated
plt1g <- ggplotGrob(plt1)
plt2g <- ggplotGrob(plt2 + makeTransparent() )

# place the two grobs onto an empty ggplot (as the positions are the same, they will be overlaid)
ggplot() + 
    annotation_custom(grob=plt1g) + 
    annotation_custom(grob=plt2g)
```

As seen, the difference-adjusted error bars are wider. This is to be
expected: their purpose (comparing two means) introduces more variability,
and more variability always reduces precision. 

## Two options

There are two methods to adjust the error bars for their purpose:

* ``"difference"``: This method is the simplest. It lengthens the 
error bars by a factor of $\sqrt{2}$ on the premise that the variances
are homogeneous.

* ``"tryon"``: This method, proposed in @t01, is used when the variances
are heterogeneous. It replaces the $\sqrt{2}$ correction factor with a 
factor $2 \times E$ based on the heterogeneity of the variances. When
the variances are roughly homogeneous, there is no visible 
difference with ``"difference"``. See 
[Vignette 7](https://dcousin3.github.io/superb/articles/Vignette7.html) for more.

The option ``"single"`` is used if the purpose is obtain "stand-alone" 
error bars or error bars that are to be compared to an *a priori* 
determine value. Such error bars are inapt to perform pair-wise 
comparisons.
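
For reference, here is how each purpose is requested. This is only a sketch of the
calls (not evaluated); everything but the ``adjustments`` argument is identical:

```{r, eval=FALSE, echo=TRUE}
# stand-alone error bars (the default purpose)
superb(score ~ grp, dataFigure1,
       adjustments = list(purpose = "single"),
       plotStyle = "line")

# difference-adjusted error bars (assumes roughly homogeneous variances)
superb(score ~ grp, dataFigure1,
       adjustments = list(purpose = "difference"),
       plotStyle = "line")

# Tryon-adjusted error bars (accommodates heterogeneous variances)
superb(score ~ grp, dataFigure1,
       adjustments = list(purpose = "tryon"),
       plotStyle = "line")
```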


## In conclusion

Adjusting the confidence intervals is important to keep the test and the figure 
coherent. If you claim that there is no difference but show Figure 1,
an examiner (you know, a reviewer) may raise a red flag and cast doubt on your
conclusions (opening the door to many rounds of reviews, or to rejection if this is a 
submitted work). 

Having coherence between the figures and the tests reported in your document is one
way to improve the clarity of your work. Here, coherence comes cheap: you just need 
to add "Difference-adjusted" before "95% confidence intervals" in the figure caption.


# References