The purpose of this post is to show you how to use two cool packages (afex and lsmeans) to easily analyse any factorial experiment.

## Background
In psychological research, the analysis of variance (ANOVA) is an extremely popular method. Many designs involve the assignment of participants into one of several groups (often denoted as treatments) where one is interested in differences between those treatments. Besides such between-subjects designs where each participant is part of only one treatment, there are also within-subjects designs where each participant serves in every treatment.

A few years ago, the analysis of such designs in R was pretty nasty for several reasons:

1. There was no easy syntax for within-subjects ANOVAs.
2. Results differed from other software such as SPSS, which uses Type III sums of squares by default.
3. The appropriate contrasts had to be set by hand.
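To illustrate points 2 and 3: without afex, getting SPSS-style Type III tests meant setting sum-to-zero contrasts globally by hand before calling car::Anova(). A minimal sketch of that recipe, using the built-in warpbreaks data:

```r
# Pre-afex recipe: Type III sums of squares are only meaningful with
# effect-coded (sum-to-zero) contrasts, which must be set manually.
options(contrasts = c("contr.sum", "contr.poly"))

library(car)  # provides Anova() with Type III tests

m <- lm(breaks ~ wool * tension, data = warpbreaks)
Anova(m, type = 3)
```

Forgetting the options() call silently produces Type III tests on treatment-coded dummies, which do not match SPSS.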

### afex for ANOVA designs

The package afex (analysis of factorial experiments), mainly written by Henrik Singmann, eases the process substantially. For all kinds of AN(C)OVA designs (between, within, mixed), you basically need only one function.
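Concretely, afex offers three equivalent front ends to the same fit; a quick sketch, assuming the current afex API and using the obk.long data that is analysed in detail below:

```r
library(afex)
data(obk.long, package = "afex")

# Three equivalent ways to specify the same mixed ANOVA:
a1 <- aov_ez("id", "value", obk.long,
             between = "treatment", within = "phase")  # character strings
a2 <- aov_car(value ~ treatment + Error(id/phase),
              data = obk.long)                         # aov()-style formula
a3 <- aov_4(value ~ treatment + (phase|id),
            data = obk.long)                           # lme4-style formula
```

Pick whichever syntax you find most readable; the results are identical.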

### lsmeans for contrasts and post-hoc tests

lsmeans is a package for testing contrasts in many linear, generalized linear, and mixed models. The cool thing: as of recently, afex and lsmeans work together smoothly.

## Install packages

You can obtain the latest version of afex from GitHub (lsmeans is available from CRAN):

install.packages("lsmeans")
devtools::install_github("singmann/afex@master")
packageVersion("afex")

## An example

I will use the dataset obk.long (available in the car package) for demonstration purposes. For information on the dataset, type:

library(afex)
help("OBrienKaiser",package="car")

From the description:
The data are from an imaginary study in which 16 female and male subjects, who are divided into three treatments, are measured at a pretest, posttest, and a follow-up session; during each session, they are measured at five occasions at intervals of one hour. The design, therefore, has two between-subjects and two within-subjects factors.

data(obk.long)
head(obk.long)
  id treatment gender   age phase hour value
1  1   control      M -4.75   pre    1     1
2  1   control      M -4.75   pre    2     2
3  1   control      M -4.75   pre    3     4
4  1   control      M -4.75   pre    4     2
5  1   control      M -4.75   pre    5     1
6  1   control      M -4.75  post    1     3

### afex

#### Run the analysis

We use the aov_ez() function to fit a mixed ANOVA with treatment as between-subjects factor and phase as within-subjects factor. We set return to “nice”. I will explain this in a minute.

(fit_nice <- aov_ez("id","value",obk.long,between=c("treatment"),within=c("phase"),return="nice"))
Warning: More than one observation per cell, aggregating the data using mean (i.e, fun.aggregate = mean)!
Contrasts set to contr.sum for the following variables: treatment
Anova Table (Type 3 tests)

Response: value
Effect    df  MSE         F ges p.value
1       treatment 2, 13 6.41    2.91 + .27     .09
2           phase 2, 26 0.71 19.29 *** .21  <.0001
3 treatment:phase 4, 26 0.71   5.43 ** .13    .003

As you see, contrasts are automatically set to effect-coding (contr.sum) and, since we have more than one observation per cell, the data were automatically aggregated.
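The aggregation step afex performs can be reproduced by hand with base R's aggregate(), averaging the five hourly measurements within each id × treatment × phase cell:

```r
library(afex)  # for the obk.long data
data(obk.long, package = "afex")

# One mean per cell of the design that the ANOVA actually analyses:
agg <- aggregate(value ~ id + treatment + phase, data = obk.long, FUN = mean)
nrow(agg)  # 16 subjects x 3 phases = 48 rows
```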

The output is similar to the usual analysis of variance table, with three differences: two additional columns show the mean squared error (MSE) and generalized $$\eta^{2}$$ (ges) as an effect size measure, and p-values are displayed more nicely.
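The generalized $$\eta^{2}$$ of Olejnik and Algina reduces, in a design like this where all factors are manipulated, to the effect sum of squares divided by itself plus all error sums of squares:

```latex
\eta_{G}^{2} = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + \sum SS_{\text{error}}}
```

For treatment, using the sums of squares from the summary() output shown further below: 37.35 / (37.35 + 83.317 + 18.433) ≈ .27, matching the ges column.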

If we don’t use the return argument, the object storing the result is a list with five elements. This is fine for small datasets where speed does not matter, but the more data you have, the more expensive it becomes to compute all of them.

# use all
fit_all <- aov_ez("id","value",obk.long,between=c("treatment"),within=c("phase"))
Warning: More than one observation per cell, aggregating the data using mean (i.e, fun.aggregate = mean)!
Contrasts set to contr.sum for the following variables: treatment
names(fit_all)
[1] "anova_table" "aov"         "Anova"       "lm"          "data"       
1. The output from the Anova() function (package: car)
2. The output from the aov() function in base R
3. The MANOVA for repeated measures
4. The output from the lm() function (DV = a matrix with 3 columns, one for each level of the within factor)
5. The data in wide and long format

We need to call summary() on this object to get the full results.

summary(fit_all)

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

SS num Df Error SS den Df        F    Pr(>F)
(Intercept)     1369.31      1   83.317     13 213.6557 1.899e-09 ***
treatment         37.35      2   83.317     13   2.9139  0.090041 .
phase             27.36      2   18.433     26  19.2939 7.289e-06 ***
treatment:phase   15.40      4   18.433     26   5.4304  0.002578 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Mauchly Tests for Sphericity

Test statistic p-value
phase                  0.85151 0.38119
treatment:phase        0.85151 0.38119

Greenhouse-Geisser and Huynh-Feldt Corrections
for Departure from Sphericity

GG eps Pr(>F[GG])
phase           0.87071  2.375e-05 ***
treatment:phase 0.87071   0.004301 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

HF eps   Pr(>F[HF])
phase           0.9939032 7.706072e-06
treatment:phase 0.9939032 2.640817e-03

As you see, the output shows the results of an RM-ANOVA assuming sphericity. In addition, Mauchly's test for sphericity as well as Greenhouse-Geisser and Huynh-Feldt corrected p-values are computed for the respective effects.
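The Greenhouse-Geisser correction simply rescales both degrees of freedom of the F test by the estimated epsilon; the corrected p-value for phase can be reproduced by hand from the values printed above (a base-R sketch):

```r
# GG correction: multiply numerator and denominator df by epsilon-hat,
# then look up the same F statistic on the shrunken dfs.
f_phase <- 19.2939   # F value for phase from the table above
eps_gg  <- 0.87071   # GG epsilon from the table above
p_gg <- pf(f_phase, df1 = 2 * eps_gg, df2 = 26 * eps_gg, lower.tail = FALSE)
p_gg  # close to the Pr(>F[GG]) of 2.375e-05 reported above
```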

So far, so good. We can also use the mixed() function to fit the same design as a linear mixed model. The output is similar.

(fit_mixed <- mixed(value~treatment*phase+(phase|id),data=obk.long))
Contrasts set to contr.sum for the following variables: treatment, phase, id
Fitting 4 (g)lmer() models:
[....]
Obtaining 3 p-values:
[...]
Effect       df F.scaling         F p.value
1       treatment    2, 13      1.00    2.91 +     .09
2           phase    2, 12      0.92 22.69 ***  <.0001
3 treatment:phase 4, 13.64      0.90   5.22 **    .009

### lsmeans

#### Compute reference grid

In a first step, we create a reference grid showing all cell means together with standard errors, degrees of freedom, and the lower and upper limits of the 95% confidence interval.

library(lsmeans)
(ref <- lsmeans(fit_all,specs = c("treatment","phase")))
 treatment phase   lsmean        SE    df lower.CL upper.CL
control   fup   4.416667 0.7149811 19.08 2.920602 5.912732
A         fup   7.266667 0.7552721 20.00 5.691185 8.842148
B         fup   7.302381 0.6659561 17.80 5.902140 8.702622
control   post  4.016667 0.7149811 19.08 2.520602 5.512732
A         post  6.516667 0.7552721 20.00 4.941185 8.092148
B         post  6.588095 0.6659561 17.80 5.187854 7.988336
control   pre   4.216667 0.7149811 19.08 2.720602 5.712732
A         pre   5.016667 0.7552721 20.00 3.441185 6.592148
B         pre   4.159524 0.6659561 17.80 2.759283 5.559765

Confidence level used: 0.95 

#### Visualise means

library(ggplot2)
ref_df <- as.data.frame(summary(ref))
pd <- position_dodge(0.1)
g4 <- ggplot(ref_df, aes(x = phase, y = lsmean, group = treatment, colour = treatment)) +
  geom_errorbar(aes(ymin = lsmean - SE, ymax = lsmean + SE), width = .1, position = pd) +
  geom_line(position = pd) +
  geom_point(position = pd) +
  theme_classic()
print(g4)

#### Testing contrasts

Let’s see whether the experimental groups (A and B) differ from the control group at the post-test phase.

We create vectors with contrast weights on the basis of the reference grid:

# AB versus C at phase level post
test1 <- c(0,0,0,-1,0.5,0.5,0,0,0)
# control pre versus control post
test2 <- c(0,0,0,1,0,0,-1,0,0)
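The weights must follow the row order of the reference grid printed above, and each vector should sum to zero to be a proper contrast; a quick sanity check:

```r
test1 <- c(0, 0, 0, -1, 0.5, 0.5, 0, 0, 0)  # AB versus control at post
test2 <- c(0, 0, 0, 1, 0, 0, -1, 0, 0)      # control post minus control pre
stopifnot(isTRUE(all.equal(sum(test1), 0)),
          isTRUE(all.equal(sum(test2), 0)))
```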

Now test both contrasts:

summary(contrast(ref,list(AB_vs_C = test1,CPre_vs_CPost = test2)))
 contrast       estimate        SE    df t.ratio p.value
AB_vs_C        2.535714 0.8820620 18.93   2.875  0.0097
CPre_vs_CPost -0.200000 0.5325314 26.00  -0.376  0.7103

#### Pairwise comparisons

Imagine we wanted to test all pairwise comparisons of treatment. Easy with lsmeans:

(ref2 <- lsmeans(fit_all,specs = c("treatment")))
NOTE: Results may be misleading due to involvement in interactions
 treatment   lsmean        SE df lower.CL upper.CL
control   4.216667 0.6454983 13 2.822152 5.611181
A         6.266667 0.6725128 13 4.813791 7.719542
B         6.016667 0.6131690 13 4.691996 7.341338

Results are averaged over the levels of: phase
Confidence level used: 0.95 
contrast(ref2,method="pairwise")
 contrast    estimate        SE df t.ratio p.value
control - A    -2.05 0.9804826 13  -2.091  0.1304
control - B    -1.80 0.8558354 13  -2.103  0.1278
A - B           0.25 0.9161171 13   0.273  0.9599

Results are averaged over the levels of: phase
P value adjustment: tukey method for comparing a family of 3 estimates 

Here, the familywise error rate is automatically controlled using the Tukey method. We can choose another method with the adjust argument:

contrast(ref2,method="pairwise",adjust="bonferroni")
 contrast    estimate        SE df t.ratio p.value
control - A    -2.05 0.9804826 13  -2.091  0.1703
control - B    -1.80 0.8558354 13  -2.103  0.1665
A - B           0.25 0.9161171 13   0.273  1.0000

Results are averaged over the levels of: phase
P value adjustment: bonferroni method for 3 tests 

#### More pairwise comparisons

Say we wanted to test all pairwise comparisons within each level of phase.

(ref3 <- lsmeans(fit_all,~treatment|phase))
phase = fup:
treatment   lsmean        SE    df lower.CL upper.CL
control   4.416667 0.7149811 19.08 2.920602 5.912732
A         7.266667 0.7552721 20.00 5.691185 8.842148
B         7.302381 0.6659561 17.80 5.902140 8.702622

phase = post:
treatment   lsmean        SE    df lower.CL upper.CL
control   4.016667 0.7149811 19.08 2.520602 5.512732
A         6.516667 0.7552721 20.00 4.941185 8.092148
B         6.588095 0.6659561 17.80 5.187854 7.988336

phase = pre:
treatment   lsmean        SE    df lower.CL upper.CL
control   4.216667 0.7149811 19.08 2.720602 5.712732
A         5.016667 0.7552721 20.00 3.441185 6.592148
B         4.159524 0.6659561 17.80 2.759283 5.559765

Confidence level used: 0.95 
comps <- contrast(ref3,method="pairwise")
summary(comps)
phase = fup:
contrast       estimate       SE    df t.ratio p.value
control - A -2.85000000 1.083531 18.93  -2.630  0.0418
control - B -2.88571429 0.945783 18.93  -3.051  0.0173
A - B       -0.03571429 1.012400 18.93  -0.035  0.9993

phase = post:
contrast       estimate       SE    df t.ratio p.value
control - A -2.50000000 1.083531 18.93  -2.307  0.0792
control - B -2.57142857 0.945783 18.93  -2.719  0.0348
A - B       -0.07142857 1.012400 18.93  -0.071  0.9973

phase = pre:
contrast       estimate       SE    df t.ratio p.value
control - A -0.80000000 1.083531 18.93  -0.738  0.7441
control - B  0.05714286 0.945783 18.93   0.060  0.9980
A - B        0.85714286 1.012400 18.93   0.847  0.6794

P value adjustment: tukey method for comparing a family of 3 estimates 
# adjusting for all 9 tests
summary(comps,by=NULL,adjust="holm")
 contrast    phase    estimate       SE    df t.ratio p.value
control - A fup   -2.85000000 1.083531 18.93  -2.630  0.1156
control - B fup   -2.88571429 0.945783 18.93  -3.051  0.0593
A - B       fup   -0.03571429 1.012400 18.93  -0.035  1.0000
control - A post  -2.50000000 1.083531 18.93  -2.307  0.1951
control - B post  -2.57142857 0.945783 18.93  -2.719  0.1092
A - B       post  -0.07142857 1.012400 18.93  -0.071  1.0000
control - A pre   -0.80000000 1.083531 18.93  -0.738  1.0000
control - B pre    0.05714286 0.945783 18.93   0.060  1.0000
A - B       pre    0.85714286 1.012400 18.93   0.847  1.0000

P value adjustment: holm method for 9 tests 

#### Polynomial contrasts

Finally, we can test for polynomial (e.g., linear and quadratic) trends across the levels of phase:

ref_poly <- lsmeans(fit_all,specs = c("phase"))
NOTE: Results may be misleading due to involvement in interactions
contrast(ref_poly,method="poly")
 contrast    estimate        SE df t.ratio p.value
[...]

Results are averaged over the levels of: treatment