Statistics
Textbooks
Boundless Statistics
Correlation and Regression
Multiple Regression
Statistics Textbooks Boundless Statistics Correlation and Regression Multiple Regression
Statistics Textbooks Boundless Statistics Correlation and Regression
Statistics Textbooks Boundless Statistics
Statistics Textbooks
Statistics
Concept Version 7
Created by Boundless

Qualitative Variable Models

Dummy, or qualitative variables, often act as independent variables in regression and affect the results of the dependent variables.

Learning Objective

  • Break down the method of inserting a dummy variable into a regression analysis in order to compensate for the effects of a qualitative variable.


Key Points

    • In regression analysis, the dependent variables may be influenced not only by quantitative variables (income, output, prices, etc.), but also by qualitative variables (gender, religion, geographic region, etc.).
    • A dummy variable (also known as a categorical variable, or qualitative variable) is one that takes the value 0 or 1 to indicate the absence or presence of some categorical effect that may be expected to shift the outcome.
    • One type of ANOVA model, applicable when dealing with qualitative variables, is a regression model in which the dependent variable is quantitative in nature but all the explanatory variables are dummies (qualitative in nature).
    • Qualitative regressors, or dummies, can have interaction effects between each other, and these interactions can be depicted in the regression model.

Terms

  • qualitative variable

    Also known as categorical variable; has no natural sense of ordering; takes on names or labels.

  • ANOVA Model

    Analysis of variance model; used to analyze the differences between group means and their associated procedures in which the observed variance in a particular variable is partitioned into components attributable to different sources of variation.


Full Text

In statistics, particularly in regression analysis, a dummy variable (also known as a categorical variable, or qualitative variable) is one that takes the value 0 or 1 to indicate the absence or presence of some categorical effect that may be expected to shift the outcome. Dummy variables are used as devices to sort data into mutually exclusive categories (such smoker/non-smoker, etc.).

Dummy variables are "proxy" variables, or numeric stand-ins for qualitative facts in a regression model. In regression analysis, the dependent variables may be influenced not only by quantitative variables (income, output, prices, etc.), but also by qualitative variables (gender, religion, geographic region, etc.). A dummy independent variable (also called a dummy explanatory variable), which for some observation has a value of 0 will cause that variable's coefficient to have no role in influencing the dependent variable, while when the dummy takes on a value 1 its coefficient acts to alter the intercept.

For example, if gender is one of the qualitative variables relevant to a regression, then the categories included under the gender variable would be female and male. If female is arbitrarily assigned the value of 1, then male would get the value 0. The intercept (the value of the dependent variable if all other explanatory variables hypothetically took on the value zero) would be the constant term for males but would be the constant term plus the coefficient of the gender dummy in the case of females.

ANOVA Models

Analysis of variance (ANOVA) models are a collection of statistical models used to analyze the differences between group means and their associated procedures (such as "variation" among and between groups). One type of ANOVA model, applicable when dealing with qualitative variables, is a regression model in which the dependent variable is quantitative in nature but all the explanatory variables are dummies (qualitative in nature).

This type of ANOVA modelcan have differing numbers of qualitative variables. An example with one qualitative variable might be if we wanted to run a regression to find out if the average annual salary of public school teachers differs among three geographical regions in a country . An example with two qualitative variables might be if hourly wages were explained in terms of the qualitative variables marital status (married / unmarried) and geographical region (North / non-North).

ANOVA Model

Graph showing the regression results of the ANOVA model example: Average annual salaries of public school teachers in 3 regions of a country.

Qualitative regressors, or dummies, can have interaction effects between each other, and these interactions can be depicted in the regression model. For example, in a regression involving determination of wages, if two qualitative variables are considered, namely, gender and marital status, there could be an interaction between marital status and gender.

[ edit ]
Edit this content
Prev Concept
Polynomial Regression
Models with Both Quantitative and Qualitative Variables
Next Concept
Subjects
  • Accounting
  • Algebra
  • Art History
  • Biology
  • Business
  • Calculus
  • Chemistry
  • Communications
  • Economics
  • Finance
  • Management
  • Marketing
  • Microbiology
  • Physics
  • Physiology
  • Political Science
  • Psychology
  • Sociology
  • Statistics
  • U.S. History
  • World History
  • Writing

Except where noted, content and user contributions on this site are licensed under CC BY-SA 4.0 with attribution required.