Statistics
Textbooks
Boundless Statistics
Correlation and Regression
Multiple Regression
Statistics Textbooks Boundless Statistics Correlation and Regression Multiple Regression
Statistics Textbooks Boundless Statistics Correlation and Regression
Statistics Textbooks Boundless Statistics
Statistics Textbooks
Statistics
Concept Version 6
Created by Boundless

Models with Both Quantitative and Qualitative Variables

A regression model that contains a mixture of quantitative and qualitative variables is called an Analysis of Covariance (ANCOVA) model.

Learning Objective

  • Demonstrate how to conduct an Analysis of Covariance, its assumptions, and its use in regression models containing a mixture of quantitative and qualitative variables.


Key Points

    • ANCOVA is a general linear model which blends ANOVA and regression. It evaluates whether population means of a dependent variable (DV) are equal across levels of a categorical independent variable (IV), while statistically controlling for the effects of covariates (CV).
    • ANCOVA can be used to increase statistical power and to adjust for preexisting differences in nonequivalent (intact) groups.
    • There are five assumptions that underlie the use of ANCOVA and affect interpretation of the results: normality of residuals, homogeneity of variances, homogeneity of regression slopes, linearity of regression, and independence of error terms.
    • When conducting ANCOVA, one should: test multicollinearity, test the homogeneity of variance assumption, test the homogeneity of regression slopes assumption, run ANCOVA analysis, and run follow-up analyses.

Terms

  • ANCOVA model

    Analysis of covariance; a general linear model which blends ANOVA and regression; evaluates whether population means of a dependent variable (DV) are equal across levels of a categorical independent variable (IV), while statistically controlling for the effects of other continuous variables that are not of primary interest, known as covariates.

  • ANOVA Model

    Analysis of variance; used to analyze the differences between group means and their associated procedures (such as "variation" among and between groups), in which the observed variance in a particular variable is partitioned into components attributable to different sources of variation.

  • covariance

    A measure of how much two random variables change together.

  • concomitant

    Happening at the same time as something else, especially because one thing is related to or causes the other (i.e., concurrent).


Full Text

A regression model that contains a mixture of both quantitative and qualitative variables is called an Analysis of Covariance (ANCOVA) model. ANCOVA models are extensions of ANOVA models. They are the statistic control for the effects of quantitative explanatory variables (also called covariates or control variables).

Covariance is a measure of how much two variables change together and how strong the relationship is between them. Analysis of covariance (ANCOVA) is a general linear model which blends ANOVA and regression. ANCOVA evaluates whether population means of a dependent variable (DV) are equal across levels of a categorical independent variable (IV), while statistically controlling for the effects of other continuous variables that are not of primary interest, known as covariates (CV). Therefore, when performing ANCOVA, we are adjusting the DV means to what they would be if all groups were equal on the CV.

Uses of ANCOVA

ANCOVA can be used to increase statistical power (the ability to find a significant difference between groups when one exists) by reducing the within-group error variance.

ANCOVA can also be used to adjust for preexisting differences in nonequivalent (intact) groups. This controversial application aims at correcting for initial group differences (prior to group assignment) that exists on DV among several intact groups. In this situation, participants cannot be made equal through random assignment, so CVs are used to adjust scores and make participants more similar than without the CV. However, even with the use of covariates, there are no statistical techniques that can equate unequal groups. Furthermore, the CV may be so intimately related to the IV that removing the variance on the DV associated with the CV would remove considerable variance on the DV, rendering the results meaningless.

Assumptions of ANCOVA

There are five assumptions that underlie the use of ANCOVA and affect interpretation of the results:

  1. Normality of Residuals. The residuals (error terms) should be normally distributed.
  2. Homogeneity of Variances. The error variances should be equal for different treatment classes.
  3. Homogeneity of Regression Slopes. The slopes of the different regression lines should be equal.
  4. Linearity of Regression. The regression relationship between the dependent variable and concomitant variables must be linear.
  5. Independence of Error Terms. The error terms should be uncorrelated.

Conducting an ANCOVA

  • Test Multicollinearity. If a CV is highly related to another CV (at a correlation of .5 or more), then it will not adjust the DV over and above the other CV. One or the other should be removed since they are statistically redundant.
  • Test the Homogeneity of Variance Assumption. This is most important after adjustments have been made, but if you have it before adjustment you are likely to have it afterwards.
  • Test the Homogeneity of Regression Slopes Assumption. To see if the CV significantly interacts with the IV, run an ANCOVA model including both the IV and the CVxIV interaction term. If the CVxIV interaction is significant, ANCOVA should not be performed. Instead, consider using a moderated regression analysis, treating the CV and its interaction as another IV. Alternatively, one could use mediation analyses to determine if the CV accounts for the IV's effect on the DV.
  • Run ANCOVA Analysis. If the CVxIV interaction is not significant, rerun the ANCOVA without the CVxIV interaction term. In this analysis, you need to use the adjusted means and adjusted MSerror. The adjusted means refer to the group means after controlling for the influence of the CV on the DV.
  • Follow-up Analyses. If there was a significant main effect, there is a significant difference between the levels of one IV, ignoring all other factors. To find exactly which levels differ significantly from one another, one can use the same follow-up tests as for the ANOVA. If there are two or more IVs, there may be a significant interaction, so that the effect of one IV on the DV changes depending on the level of another factor. One can investigate the simple main effects using the same methods as in a factorial ANOVA.

ANCOVA Model

Graph showing the regression results of an ANCOVA model example: Public school teacher's salary (Y) in relation to state expenditure per pupil on public schools.

[ edit ]
Edit this content
Prev Concept
Qualitative Variable Models
Comparing Nested Models
Next Concept
Subjects
  • Accounting
  • Algebra
  • Art History
  • Biology
  • Business
  • Calculus
  • Chemistry
  • Communications
  • Economics
  • Finance
  • Management
  • Marketing
  • Microbiology
  • Physics
  • Physiology
  • Political Science
  • Psychology
  • Sociology
  • Statistics
  • U.S. History
  • World History
  • Writing

Except where noted, content and user contributions on this site are licensed under CC BY-SA 4.0 with attribution required.