spurious variable

(noun)

a mathematical relationship in which two events or variables have no direct causal connection, yet it may be wrongly inferred that they do, due to either coincidence or the presence of a certain third, unseen factor (referred to as a "confounding factor" or "lurking variable")

Related Terms

  • collinearity
  • multicollinearity

Examples of spurious variable in the following topics:

  • Some Pitfalls: Estimability, Multicollinearity, and Extrapolation

    • A key issue seldom considered in depth is that of choice of explanatory variables.
    • There are several examples of fairly silly proxy variables in research, such as using habitat variables to "describe" badger densities.
    • In a study on factors affecting unfriendliness/aggression in pet dogs, the fact that their chosen explanatory variables explained a mere 7% of the variability should have prompted the authors to consider other variables, such as the behavioral characteristics of the owners.
    • Despite the fact that automated stepwise procedures for fitting multiple regression were discredited years ago, they are still widely used and continue to produce overfitted models containing various spurious variables.
    • Examine how the improper choice of explanatory variables, the presence of multicollinearity between variables, and extrapolation of poor quality can negatively affect the results of a multiple linear regression.
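
As a rough illustration of the multicollinearity pitfall, the sketch below computes the variance inflation factor (VIF) for the special case of exactly two predictors, where it reduces to 1 / (1 - r^2), with r the correlation between the predictors. The data and the function names (`pearson`, `vif_two_predictors`) are made up for this example and do not come from the topic itself.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def vif_two_predictors(x1, x2):
    """Variance inflation factor when the model has exactly two predictors."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical data: x2 is roughly twice x1, while x3 is uncorrelated with x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]
x3 = [1.0, -1.0, 0.0, -1.0, 1.0]

vif_collinear = vif_two_predictors(x1, x2)    # large: x1 and x2 carry nearly the same information
vif_independent = vif_two_predictors(x1, x3)  # 1.0: no inflation
print(vif_collinear, vif_independent)
```

A large VIF means the coefficient estimates for the two predictors are poorly determined, because the data cannot separate their individual contributions.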
  • Experimental Design

    • Other variables, which may not be readily obvious, may interfere with the experimental design.
    • To control for nuisance variables, researchers institute control checks as additional measures.
    • One of the most important requirements of experimental research designs is the necessity of eliminating the effects of spurious, intervening, and antecedent variables.
    • $Z$ is said to be a spurious variable and must be controlled for.
    • The same is true for intervening variables (variables that lie between the supposed cause, $X$, and the effect, $Y$) and antecedent variables (variables that come before the supposed cause, $X$, and are the true cause).
  • Stepwise Regression

    • Forward selection involves starting with no variables in the model, testing the addition of each variable using a chosen model comparison criterion, adding the variable (if any) that improves the model the most, and repeating this process until none improves the model.
    • Backward elimination involves starting with all candidate variables, testing the deletion of each variable using a chosen model comparison criterion, deleting the variable (if any) that improves the model the most by being deleted, and repeating this process until no further improvement is possible.
    • This problem can be mitigated if the criterion for adding (or deleting) a variable is stiff enough.
    • The key line in the sand is at what can be thought of as the Bonferroni point: namely how significant the best spurious variable should be based on chance alone.
    • Unfortunately, this means that many variables which actually carry signal will not be included.
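
The "Bonferroni point" can be explored with a small simulation. The sketch below mimics only the first step of forward selection when every candidate variable is pure noise, and compares how often the best candidate passes a nominal threshold with how often it passes the threshold divided by the number of candidates. All settings (`n_obs`, `n_candidates`, `n_trials`, `alpha`) are arbitrary assumptions, and the p-values use the Fisher z approximation rather than an exact test.

```python
import math
import random
from statistics import NormalDist


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def approx_p_value(r, n):
    """Two-sided p-value for a correlation, using the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def best_noise_p_value(rng, n_obs, n_candidates):
    """Smallest p-value among candidates that are all unrelated to the response."""
    y = [rng.gauss(0, 1) for _ in range(n_obs)]
    best = 1.0
    for _ in range(n_candidates):
        x = [rng.gauss(0, 1) for _ in range(n_obs)]
        best = min(best, approx_p_value(pearson(x, y), n_obs))
    return best


rng = random.Random(42)
n_obs, n_candidates, n_trials, alpha = 50, 20, 200, 0.05
best_ps = [best_noise_p_value(rng, n_obs, n_candidates) for _ in range(n_trials)]

# Share of trials in which the best noise variable would enter the model.
nominal_rate = sum(p < alpha for p in best_ps) / n_trials
bonferroni_rate = sum(p < alpha / n_candidates for p in best_ps) / n_trials
print(nominal_rate, bonferroni_rate)
```

With these settings the nominal threshold should admit a spurious variable in a majority of trials, while the Bonferroni-style threshold should admit one only rarely. The price, as noted above, is that weak but genuine signals are also less likely to be selected.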
  • Confounding

    • A confounding variable is an extraneous variable in a statistical model that correlates (positively or negatively) with both the dependent variable and the independent variable.
    • A perceived relationship between an independent variable and a dependent variable that has been misestimated due to the failure to account for a confounding factor is termed a spurious relationship, and the presence of misestimation for this reason is termed omitted-variable bias.
    • However, a more likely explanation is that the relationship between ice cream consumption and drowning is spurious and that a third, confounding, variable (the season) influences both variables: during the summer, warmer temperatures lead to increased ice cream consumption as well as more people swimming and, thus, more drowning deaths.
    • Break down why confounding variables may lead to bias and spurious relationships and what can be done to avoid these phenomena.
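
The ice cream and drowning example can be sketched as a simulation in which the season drives both variables and neither affects the other. The numbers below (group means, noise levels, alternating seasons) are invented for illustration. Comparing the overall correlation with the correlation inside each season is one simple way to control for the confounder.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)
n_days = 2000

# Hypothetical setup: days alternate between summer and winter.
is_summer = [day % 2 == 0 for day in range(n_days)]

# Season shifts both variables; the noise terms are independent of each other.
ice_cream = [(10.0 if s else 2.0) + rng.gauss(0, 1) for s in is_summer]
drownings = [(8.0 if s else 1.0) + rng.gauss(0, 1) for s in is_summer]


def within_season(flag):
    """Correlation between the two variables using only days of one season."""
    pairs = [(i, d) for i, d, s in zip(ice_cream, drownings, is_summer) if s == flag]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])


overall = pearson(ice_cream, drownings)  # strong, but spurious
summer_only = within_season(True)        # near zero once season is held fixed
winter_only = within_season(False)       # near zero once season is held fixed
print(overall, summer_only, winter_only)
```

The strong overall correlation disappears within each season, which is what one would expect if the season, rather than ice cream consumption, is responsible for the pattern.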
  • Explanatory and response variables

    • If we suspect poverty might affect spending in a county, then poverty is the explanatory variable and federal spending is the response variable in the relationship.
    • Sometimes the explanatory variable is called the independent variable and the response variable is called the dependent variable.
    • If there are many variables, it may be possible to consider a number of them as explanatory variables.
    • The explanatory variable might affect the response variable.
    • In some cases, there is no explanatory or response variable.
  • Variables

    • In this case, the variable is "type of antidepressant." When a variable is manipulated by an experimenter, it is called an independent variable.
    • An important distinction between variables is between qualitative variables and quantitative variables.
    • Qualitative variables are sometimes referred to as categorical variables.
    • Quantitative variables are those variables that are measured in terms of numbers.
    • The variable "type of supplement" is a qualitative variable; there is nothing quantitative about it.
  • Types of Variables

    • Numeric variables have values that describe a measurable quantity as a number, like "how many" or "how much." Therefore, numeric variables are quantitative variables.
    • A continuous variable is a numeric variable.
    • A discrete variable is a numeric variable.
    • An ordinal variable is a categorical variable.
    • A nominal variable is a categorical variable.
  • Qualitative Variable Models

    • Dummy, or qualitative, variables often act as independent variables in regression and affect the results of the dependent variable.
    • Dummy variables are "proxy" variables, or numeric stand-ins for qualitative facts in a regression model.
    • In regression analysis, the dependent variables may be influenced not only by quantitative variables (income, output, prices, etc.), but also by qualitative variables (gender, religion, geographic region, etc.).
    • One type of ANOVA model, applicable when dealing with qualitative variables, is a regression model in which the dependent variable is quantitative in nature but all the explanatory variables are dummies (qualitative in nature).
    • Break down the method of inserting a dummy variable into a regression analysis in order to compensate for the effects of a qualitative variable.
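
A minimal sketch of dummy coding, assuming a single qualitative variable and a chosen reference category. The variable names and data (`region`, `spending`) are hypothetical. When the dummies for one factor are the only explanatory variables, the least-squares fit reproduces the group means, so the intercept equals the mean of the reference group and each dummy coefficient equals the difference between its group mean and the reference mean.

```python
def dummy_encode(values, reference):
    """Return (column_names, rows) with one 0/1 column per non-reference level."""
    levels = sorted(set(values) - {reference})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


def group_means(values, ys):
    """Mean of ys within each level of the qualitative variable."""
    totals, counts = {}, {}
    for v, y in zip(values, ys):
        totals[v] = totals.get(v, 0.0) + y
        counts[v] = counts.get(v, 0) + 1
    return {v: totals[v] / counts[v] for v in totals}


# Hypothetical data: a qualitative predictor and a quantitative response.
region = ["north", "south", "west", "south"]
spending = [10.0, 14.0, 20.0, 16.0]

columns, rows = dummy_encode(region, reference="north")
# columns == ["south", "west"]; rows == [[0, 0], [1, 0], [0, 1], [1, 0]]

means = group_means(region, spending)
intercept = means["north"]  # mean of the reference group
coefficients = {level: means[level] - intercept for level in columns}
# intercept == 10.0; coefficients == {"south": 5.0, "west": 10.0}
print(columns, rows, intercept, coefficients)
```

One level is left out as the reference because including a dummy for every level alongside an intercept would make the columns perfectly collinear.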
  • Types of variables

    • This variable seems to be a hybrid: it is a categorical variable but the levels have a natural ordering.
    • A variable with these properties is called an ordinal variable.
    • To simplify analyses, any ordinal variables in this book will be treated as categorical variables.
    • Are these numerical or categorical variables?
    • Thus, each is a categorical variable.
  • Slope and Intercept

    • The general purpose is to explain how one variable, the dependent variable, is systematically related to the values of one or more independent variables.
    • The coefficients are numeric constants by which variable values in the equation are multiplied or which are added to a variable value to determine the unknown.
    • Here, by convention, $x$ and $y$ are the variables of interest in our data, with $y$ the unknown or dependent variable and $x$ the known or independent variable.
    • Linear regression is an approach to modeling the relationship between a scalar dependent variable $y$ and one or more explanatory (independent) variables denoted $X$.
    • (This term should be distinguished from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable.)
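
For a single independent variable $x$ and dependent variable $y$, the slope and intercept of the least-squares line can be computed directly. The sketch below uses the standard closed-form estimates; the function name `fit_line` and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y as a linear function of x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx                     # change in y per unit change in x
    intercept = mean_y - slope * mean_x   # value of y when x is 0
    return slope, intercept


# Hypothetical data lying exactly on the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```

The fitted line always passes through the point of means, which is why the intercept follows from the slope and the two averages.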

Except where noted, content and user contributions on this site are licensed under CC BY-SA 4.0 with attribution required.