Concept Version 7
Created by Boundless

Inferences of Correlation and Regression

The chi-square test of association allows us to evaluate associations (or correlations) between categorical data.

Learning Objective

  • Calculate the adjusted standardized residuals for a chi-square test


Key Points

    • The chi-square test indicates whether there is an association between two categorical variables, but unlike the correlation coefficient between two quantitative variables, it does not in itself give an indication of the strength of the association.
    • In order to describe the association more fully, it is necessary to identify the cells that have large differences between the observed and expected frequencies. These differences are referred to as residuals, and they can be standardized and adjusted to follow a Normal distribution.
    • The larger the absolute value of the residual, the larger the difference between the observed and expected frequencies, and therefore the more significant the association between the two variables.

Terms

  • correlation coefficient

    Any of the several measures indicating the strength and direction of a linear relationship between two random variables.

  • residuals

    The difference between the observed value and the estimated function value.


Full Text

The chi-square test of association allows us to evaluate associations (or correlations) between categorical data. It indicates whether there is an association between two categorical variables, but unlike the correlation coefficient between two quantitative variables, it does not in itself give an indication of the strength of the association.

In order to describe the association more fully, it is necessary to identify the cells that have large differences between the observed and expected frequencies. These differences are referred to as residuals, and they can be standardized and adjusted to follow a normal distribution with mean $0$ and standard deviation $1$. The adjusted standardized residuals, $d_{ij}$, are given by:

$\displaystyle{d_{ij}=\dfrac{O_{ij}-E_{ij}}{\sqrt{E_{ij}\left(1-\dfrac{n_{i}}{N}\right)\left(1-\dfrac{n_{j}}{N}\right)}}}$

where $n_i$ is the total frequency for row $i$, $n_j$ is the total frequency for column $j$, and $N$ is the overall total frequency. The larger the absolute value of the residual, the larger the difference between the observed and expected frequencies, and therefore the more significant the association between the two variables.
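The formula can be applied to every cell of a contingency table at once. The following is a minimal sketch in plain Python, not a reference implementation: the function name `adjusted_residuals` and the 2x2 table of counts are made up for illustration, and the expected frequency is assumed to be the usual $E_{ij} = n_i n_j / N$.

```python
from math import sqrt


def adjusted_residuals(observed):
    """Return the adjusted standardized residual d_ij for each cell.

    `observed` is a list of rows, each row a list of observed counts O_ij.
    """
    row_totals = [sum(row) for row in observed]          # n_i
    col_totals = [sum(col) for col in zip(*observed)]    # n_j
    n = sum(row_totals)                                  # N

    result = []
    for i, row in enumerate(observed):
        out_row = []
        for j, o_ij in enumerate(row):
            # Expected frequency under no association (assumed definition).
            e_ij = row_totals[i] * col_totals[j] / n
            # Denominator of the adjusted standardized residual.
            scale = sqrt(e_ij * (1 - row_totals[i] / n) * (1 - col_totals[j] / n))
            out_row.append((o_ij - e_ij) / scale)
        result.append(out_row)
    return result


# Hypothetical counts, invented purely to exercise the function.
example = [[30, 10],
           [20, 40]]
residuals = adjusted_residuals(example)
for row in residuals:
    print([round(d, 2) for d in row])
```

In a 2x2 table such as this one, all four residuals share the same absolute value and differ only in sign, so the sign pattern shows which cells hold more observations than expected.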

Table 1

Numbers of patients classified by site of central venous cannula and infectious complication. This table shows the proportions of patients in the sample with cannulae sited at the internal jugular, subclavian and femoral veins.

Using the above formula to find the adjusted standardized residual for those with cannulae sited at the internal jugular and no infectious complications yields: $\frac{686-714.5}{\sqrt{714.5\left(1-\frac{934}{1706}\right)\left(1-\frac{1305}{1706}\right)}}=-3.3$. The subclavian site/no infectious complication cell has the largest residual, at 6.2. Because it is positive, there are more individuals than expected with no infectious complications where the subclavian central line site was used. As these residuals follow a normal distribution with mean 0 and standard deviation 1, absolute values greater than about 2 (more precisely, 1.96) are significant at the 5% level. The residual for femoral site/no infectious complication is also significant, but because it is negative, there are fewer individuals than expected in this cell. When the subclavian central line site was used, infectious complications appear to be less likely than when the other two sites were used.
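The internal jugular calculation can be checked numerically from the four figures quoted in the text (observed count 686, row total 934, column total 1305, overall total 1706). This is a small sketch only: the helper name `adjusted_residual` is hypothetical, the expected frequency is assumed to be row total times column total divided by the overall total, and the two-sided p-value comes from the standard normal distribution via `math.erfc`.

```python
from math import erfc, sqrt


def adjusted_residual(observed, row_total, col_total, n):
    """Return (expected frequency, adjusted standardized residual) for one cell."""
    expected = row_total * col_total / n  # assumed definition of E_ij
    scale = sqrt(expected * (1 - row_total / n) * (1 - col_total / n))
    return expected, (observed - expected) / scale


# Figures quoted in the text for internal jugular / no infectious complication.
expected, d = adjusted_residual(686, 934, 1305, 1706)
print(round(expected, 1), round(d, 1))  # 714.5 -3.3

# Two-sided p-value for a standard normal variable with this absolute value.
p_value = erfc(abs(d) / sqrt(2))
print(p_value < 0.05)  # True
```

Because the absolute value of the residual is well above 1.96, the p-value falls below 0.05, in line with the text's description of this cell.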

Table 2

The adjusted standardized residuals from Table 1.


Except where noted, content and user contributions on this site are licensed under CC BY-SA 4.0 with attribution required.