Lecture 3: ANOVA & ANCOVA (ARMS, Utrecht University)

ANOVA: comparing groups on a continuous variable. The analysis separates between-group variation from within-group variation.
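To make the split into between-group and within-group variation concrete, here is a minimal sketch in Python. The scores are made up for illustration and the function name one_way_anova is hypothetical; it is not SPSS output. The F value is the ratio of the two mean squares.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (SS_between, SS_within, F) for a list of groups of scores."""
    all_scores = [y for g in groups for y in g]
    grand_mean = mean(all_scores)
    # Between-group variation: distance of each group mean from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: distance of each score from its own group mean
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat

# Made-up scores for three hypothetical groups
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
ss_b, ss_w, f = one_way_anova(groups)
print(ss_b, ss_w, f)  # SS_between = 24, SS_within = 6, F = 12
```

The two sums of squares add up to the total variation around the grand mean (24 + 6 = 30 here).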

But what if other factors also influence the outcome y? Then you need to control for them, and an ANCOVA lets you do this. Example: you want to research whether there is a difference in cognitive abilities between babies of teen moms and babies of adult moms. Not only the mother's age but also IQ influences cognitive abilities, so you need to control for IQ in this analysis.

In an ANCOVA, you always need a categorical predictor (factor): we want to compare groups so there must be a variable creating groups.

The covariate is always continuous.

Homogeneity of regression slopes is an assumption of ANCOVA. Checking homogeneity:

1. Make a scatter plot of the outcome against the covariate and draw a regression line through each group's cloud of points. The lines of the two groups must be (roughly) parallel. If not, there is no homogeneity of regression slopes.

2. You can also check homogeneity by looking at the significance of the interaction effect between the factor and the covariate. If the interaction is significant, there is no homogeneity. If it is not significant, the assumption is met.

Do both 1 and 2!
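Both checks can be sketched in a few lines of Python. This is a rough illustration under assumptions: the data are made up, the function names are hypothetical, and the F value for the interaction is obtained by comparing the residual variation of a model with one shared slope to a model with a separate slope per group. Turning F into a p-value requires the F distribution from a statistics package, which is left out here.

```python
from statistics import mean

def centered_sums(x, y):
    """Return (Sxx, Sxy, Syy) computed around the group's own means."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    return sxx, sxy, syy

def slope_homogeneity(groups):
    """groups: list of (covariate, outcome) lists. Returns (slopes, F, df1, df2)."""
    sums = [centered_sums(x, y) for x, y in groups]
    # Check 1: the slope of each group's regression line
    slopes = [sxy / sxx for sxx, sxy, _ in sums]
    # Residual SS when each group has its own slope (model with interaction)
    ss_res_separate = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in sums)
    # Residual SS when all groups share one pooled slope (model without interaction)
    p_sxx = sum(s[0] for s in sums)
    p_sxy = sum(s[1] for s in sums)
    p_syy = sum(s[2] for s in sums)
    ss_res_common = p_syy - p_sxy ** 2 / p_sxx
    k = len(groups)
    n = sum(len(x) for x, _ in groups)
    df1, df2 = k - 1, n - 2 * k
    # Check 2: F statistic for the factor-by-covariate interaction
    f_stat = ((ss_res_common - ss_res_separate) / df1) / (ss_res_separate / df2)
    return slopes, f_stat, df1, df2

# Made-up data: the second group has a clearly steeper slope
group_a = ([1, 2, 3, 4], [2, 3, 5, 6])
group_b = ([1, 2, 3, 4], [3, 6, 8, 11])
slopes, f_stat, df1, df2 = slope_homogeneity([group_a, group_b])
print(slopes, f_stat, df1, df2)
```

In this made-up example the slopes are about 1.4 and 2.6, so the lines are not parallel, and the interaction F is large relative to its degrees of freedom (1 and 4).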

There is another assumption (but researchers don’t all agree on how to interpret this) that states that there should be independence of the covariate and the factor:

  • ANCOVA can be used for groups that are randomized. Any differences between such groups on the covariate are chance differences, because the groups are random. So everyone agrees that if random groups differ, you can use covariates to control for these chance differences.
  • But some researchers also conclude that ANCOVA should never be used on existing groups. Using the example above, they say you should not use IQ as a covariate because it is not independent of the factor, and that’s a violation of the assumption. In practice we say: be careful with the interpretation of results for existing groups.

AN(C)OVA test statistic: F = MSgroup/MSresidual

Adding a covariate will change the F-test if:

  • Adjusted means change (this has an effect on MSgroup)
  • The covariate is (strongly) related to Y (this has an effect on MSresidual)

So: inclusion of a covariate can be useful when groups differ on the covariate, but also when they do not! But, to be useful, the covariate must be related to Y.
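As an illustration of the first point, the sketch below computes adjusted means the usual textbook way: each group mean is shifted along the pooled within-group slope to the grand mean of the covariate. The data and the function name adjusted_means are made up for this example.

```python
from statistics import mean

def adjusted_means(groups):
    """groups: list of (covariate, outcome) lists, one pair per group."""
    grand_x = mean([a for x, _ in groups for a in x])
    # Pooled within-group slope of the outcome on the covariate
    sxy = sum(
        sum((a - mean(x)) * (b - mean(y)) for a, b in zip(x, y))
        for x, y in groups
    )
    sxx = sum(sum((a - mean(x)) ** 2 for a in x) for x, _ in groups)
    slope = sxy / sxx
    # Move each group mean to where it would be at the grand covariate mean
    return [mean(y) - slope * (mean(x) - grand_x) for x, y in groups]

# Made-up data: group B scores higher on both the covariate and the outcome
group_a = ([1, 2, 3], [2, 3, 4])
group_b = ([3, 4, 5], [4, 5, 6])
raw = [mean(y) for _, y in (group_a, group_b)]
adj = adjusted_means([group_a, group_b])
print(raw, adj)  # raw means 3 and 5, adjusted means 4.0 and 4.0
```

Here the raw difference between the groups disappears after adjustment, so MSgroup (and with it F) would drop once the covariate is included.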

In a regression model, you have R² for the explained variance. In AN(C)OVA we have a similar measure, called eta-squared. SPSS, however, provides the partial eta-squared. This is still a proportion of variance explained, but the variation of the effect is divided not by the total variation but by the sum of the variation of this effect and the residual variation. So it answers: how much variance does this effect explain out of the variance that is not explained by the other effects in the model?
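The difference between the two measures is easy to see with numbers. The sums of squares below are hypothetical, and for simplicity the sketch assumes that the effects do not overlap, so that they add up to the total.

```python
def eta_squared(ss_effect, ss_total):
    """Proportion of the total variation explained by the effect."""
    return ss_effect / ss_total

def partial_eta_squared(ss_effect, ss_residual):
    """Variation explained by the effect relative to effect plus residual."""
    return ss_effect / (ss_effect + ss_residual)

# Hypothetical sums of squares for a model with one factor and one covariate
ss_group, ss_covariate, ss_residual = 20.0, 50.0, 30.0
ss_total = ss_group + ss_covariate + ss_residual  # assumption: no overlap

eta2 = eta_squared(ss_group, ss_total)
partial_eta2 = partial_eta_squared(ss_group, ss_residual)
print(eta2, partial_eta2)  # 0.2 and 0.4
```

The partial eta-squared is larger here because the variation already explained by the covariate is left out of the denominator.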

The AN(C)OVA tells you whether or not there is a difference between the groups. If you find a difference, you need follow-up testing to know which groups differ. In this case, you can use a post-hoc test, which uses pairwise comparisons to tell you which groups differ significantly. A post-hoc test applies a correction to protect against an inflated Type I error rate.

There is another approach, called planned comparisons. This uses specific contrast tests. Before you do the analysis, you specify which groups you want to compare (for example, because you expect them to differ). The analysis then only covers the comparisons you’re interested in, and does not (usually) apply any alpha correction.

One correction that is often used in a post-hoc test is the Bonferroni correction. This multiplies each p-value by the number of tests you run (with a maximum of 1), so that you can still compare the p-value that SPSS shows to the original alpha level.
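The Bonferroni adjustment described above can be sketched as follows. The p-values are made up, and the function name bonferroni is just a label for this example.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Made-up uncorrected p-values from three pairwise comparisons
raw_p = [0.01, 0.02, 0.40]
adjusted_p = bonferroni(raw_p)

alpha = 0.05
significant = [p < alpha for p in adjusted_p]
print(adjusted_p, significant)
```

With three tests the adjusted p-values are roughly 0.03, 0.06 and 1.0, so only the first comparison stays significant at the original alpha of .05.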

Questions? Let me know in the contribution section!

Contributions, Comments & Kudos


Hey Julia, thanks for the summary and explaining the difference between ANOVA and ANCOVA!! It really helped me understand the topics!!! I do have a quick question: Why is the covariate always continuous? 

Hi Roos! I'm glad the study notes helped you understand the topics :) Regarding your question: I think the covariate has to be continuous (interval or ratio level) because the slope has to represent actual ordered numbers to be able to identify the relationship between the covariate and the dependent variable. However, I found on the internet that a covariate can be categorical as well, but then the analysis method you use is not an ANCOVA:


I hope this answers your question!

Summaries & Study Note of JuliaV