Concept Version 9
Created by Boundless

The One-Way F-Test

The $F$-test as a one-way analysis of variance assesses whether the expected values of a quantitative variable within groups differ from each other.

Learning Objective

  • Explain the purpose of the one-way ANOVA $F$-test and perform the necessary calculations.


Key Points

    • The advantage of the ANOVA $F$-test is that we do not need to pre-specify which treatments are to be compared, and we do not need to adjust for making multiple comparisons.
    • The disadvantage of the ANOVA $F$-test is that if we reject the null hypothesis, we do not know which treatments can be said to be significantly different from the others.
    • If the $F$-test is performed at level $\alpha$, we cannot state that the treatment pair with the greatest mean difference is significantly different at level $\alpha$.
    • The $F$-statistic will be large if the between-group variability is large relative to the within-group variability, which is unlikely to happen if the population means of the groups all have the same value.

Terms

  • ANOVA

    Analysis of variance—a collection of statistical models used to analyze the differences between group means and their associated procedures (such as "variation" among and between groups).

  • F-Test

    a statistical test using the $F$-distribution, most often used when comparing statistical models that have been fitted to a data set, in order to identify the model that best fits the population from which the data were sampled

  • omnibus

    containing multiple items


Full Text

The $F$-test as a one-way analysis of variance is used to assess whether the expected values of a quantitative variable within several pre-defined groups differ from each other. For example, suppose that a medical trial compares four treatments. The ANOVA $F$-test can be used to assess whether any of the treatments is on average superior, or inferior, to the others versus the null hypothesis that all four treatments yield the same mean response. This is an example of an "omnibus" test, meaning that a single test is performed to detect any of several possible differences.

Alternatively, we could carry out pairwise tests among the treatments (for instance, in the medical trial example with four treatments we could carry out six tests among pairs of treatments). The advantage of the ANOVA $F$-test is that we do not need to pre-specify which treatments are to be compared, and we do not need to adjust for making multiple comparisons. The disadvantage of the ANOVA $F$-test is that if we reject the null hypothesis, we do not know which treatments can be said to be significantly different from the others. If the $F$-test is performed at level $\alpha$, we cannot state that the treatment pair with the greatest mean difference is significantly different at level $\alpha$.
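To make the pairwise alternative concrete, the following minimal sketch counts the pairs among four treatments and shows how the chance of at least one false rejection grows when each pairwise test is run at level $\alpha$ without adjustment. Treating the tests as independent is a simplifying assumption made here for illustration only (pairwise tests that share a group are not independent), and the function names are hypothetical.

```python
from math import comb

def n_pairwise_tests(k):
    """Number of distinct pairs among k treatments."""
    return comb(k, 2)

def familywise_error_rate(alpha, m):
    """Chance of at least one false rejection in m tests at level alpha.

    Assumes the m tests are independent, which is only an approximation
    for pairwise comparisons that share groups.
    """
    return 1 - (1 - alpha) ** m

m = n_pairwise_tests(4)
print(m)                                         # 6
print(round(familywise_error_rate(0.05, m), 3))  # 0.265
```

Under this assumption, six unadjusted tests at the 5% level carry roughly a 26% chance of at least one false positive, which is the reason pairwise testing calls for an adjustment that the single omnibus test avoids.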

The formula for the one-way ANOVA $F$-test statistic is:

$$F = \dfrac{\text{explained variance}}{\text{unexplained variance}}$$

or

$$F = \dfrac{\text{between-group variability}}{\text{within-group variability}}$$

The "explained variance," or "between-group variability" is:

$$\sum_{i} \frac{n_i \left( \bar{Y}_i - \bar{Y} \right)^2}{K-1}$$

where $\bar{Y}_i$ denotes the sample mean in the $i$th group, $n_i$ is the number of observations in the $i$th group, $\bar{Y}$ denotes the overall mean of the data, and $K$ denotes the number of groups.

The "unexplained variance", or "within-group variability" is:

$$\sum_{ij} \frac{\left( Y_{ij} - \bar{Y}_i \right)^2}{N-K}$$

where $Y_{ij}$ is the $j$th observation in the $i$th out of $K$ groups and $N$ is the overall sample size. This $F$-statistic follows the $F$-distribution with $K-1$ and $N-K$ degrees of freedom under the null hypothesis. The statistic will be large if the between-group variability is large relative to the within-group variability, which is unlikely to happen if the population means of the groups all have the same value.
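The two variability terms can be sketched in a few lines of plain Python. This is a minimal illustration of the formulas above, not a substitute for a statistics library; the function name `one_way_f` and the toy data are made up for the example.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    K = len(groups)                                  # number of groups
    N = sum(len(g) for g in groups)                  # overall sample size
    grand_mean = sum(sum(g) for g in groups) / N     # overall mean of the data
    means = [sum(g) / len(g) for g in groups]        # sample mean of each group

    # Between-group variability: weighted squared deviations of the group
    # means from the overall mean, divided by K - 1.
    between = sum(len(g) * (m - grand_mean) ** 2
                  for g, m in zip(groups, means)) / (K - 1)

    # Within-group variability: squared deviations of each observation
    # from its own group mean, divided by N - K.
    within = sum((y - m) ** 2
                 for g, m in zip(groups, means) for y in g) / (N - K)

    return between / within, K - 1, N - K

# Hypothetical toy data: three groups of three observations each.
toy = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(one_way_f(toy))  # (3.0, 2, 6)
```

For the toy data the group means are 2, 3 and 4, so the between-group variability is 6/2 = 3 and the within-group variability is 6/6 = 1, giving $F = 3$ on 2 and 6 degrees of freedom.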

Note that when there are only two groups for the one-way ANOVA $F$-test, $F = t^2$ where $t$ is the Student's $t$-statistic.
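A short derivation shows why the two-group identity holds. It assumes the equal-variance (pooled) form of the two-sample $t$-statistic; the symbol $s_p^2$ for the pooled sample variance is notation introduced here, not taken from the text. With $K = 2$ the degrees of freedom are $K - 1 = 1$ and $N - K = n_1 + n_2 - 2$.

```latex
\begin{aligned}
\bar{Y} &= \frac{n_1\bar{Y}_1 + n_2\bar{Y}_2}{n_1+n_2}
\;\Rightarrow\;
n_1\left(\bar{Y}_1-\bar{Y}\right)^2 + n_2\left(\bar{Y}_2-\bar{Y}\right)^2
  = \frac{n_1 n_2}{n_1+n_2}\left(\bar{Y}_1-\bar{Y}_2\right)^2, \\
s_p^2 &= \frac{\sum_{j}\left(Y_{1j}-\bar{Y}_1\right)^2 + \sum_{j}\left(Y_{2j}-\bar{Y}_2\right)^2}{n_1+n_2-2}, \\
F &= \frac{\dfrac{n_1 n_2}{n_1+n_2}\left(\bar{Y}_1-\bar{Y}_2\right)^2}{s_p^2}
   = \frac{\left(\bar{Y}_1-\bar{Y}_2\right)^2}{s_p^2\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}
   = t^2,
\qquad
t = \frac{\bar{Y}_1-\bar{Y}_2}{s_p\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}}.
\end{aligned}
```

The numerator of $F$ is the between-group variability with one degree of freedom, and the denominator is the within-group variability, which for two groups is exactly the pooled variance.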

Example

Each of four sororities took a random sample of its sisters and recorded their mean grades for the past term. The results were as follows:

  • Sorority 1: 2.17, 1.85, 2.83, 1.69, 3.33
  • Sorority 2: 2.63, 1.77, 3.25, 1.86, 2.21
  • Sorority 3: 2.63, 3.78, 4.00, 2.55, 2.45
  • Sorority 4: 3.79, 3.45, 3.08, 2.26, 3.18

Using a significance level of 1%, is there a difference in mean grades among the sororities?

Solution

Let $\mu_1$, $\mu_2$, $\mu_3$, $\mu_4$ be the population means of the sororities. Remember that the null hypothesis claims that the sorority groups are from the same normal distribution. The alternative hypothesis says that at least two of the sorority groups come from populations with different normal distributions. Notice that each of the four samples has size 5. This is an example of a balanced design, since each level of the factor (i.e., each sorority) has the same number of observations.

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$

$H_a$: Not all of the means $\mu_1$, $\mu_2$, $\mu_3$, $\mu_4$ are equal

Distribution for the test: $F_{3, 16}$

where $K = 4$ is the number of groups and $N = 20$ is the total number of observations

$df_{\text{numerator}} = K-1 = 4-1 = 3$

$df_{\text{denominator}} = N-K = 20-4 = 16$

Calculate the test statistic: $F = 2.23$
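The value of the test statistic can be reproduced by substituting the sample data into the between-group and within-group formulas. The intermediate figures below are rounded to four decimal places, so the last digit may differ slightly from software output.

```latex
\begin{aligned}
&\bar{Y}_1 = 2.374,\quad \bar{Y}_2 = 2.344,\quad \bar{Y}_3 = 3.082,\quad \bar{Y}_4 = 3.152,\quad \bar{Y} = 2.738 \\[4pt]
&\text{between-group variability}
  = \frac{5\left[(-0.364)^2 + (-0.394)^2 + (0.344)^2 + (0.414)^2\right]}{4-1}
  = \frac{2.8873}{3} \approx 0.9624 \\[4pt]
&\text{within-group variability}
  = \frac{1.9059 + 1.4843 + 2.2167 + 1.2975}{20-4}
  = \frac{6.9044}{16} \approx 0.4315 \\[4pt]
&F = \frac{0.9624}{0.4315} \approx 2.23
\end{aligned}
```

The four terms in the within-group numerator are the sums of squared deviations from the group mean for Sororities 1 through 4, respectively.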

Graph:

[Figure: Graph of $p$-value. The chart shows example $p$-values for two $F$-statistics: $p = 0.05$ for $F = 3.68$, and $p = 0.00239$ for $F = 9.27$. These values reflect the right skew of the $F$-curve: a much larger $F$-value corresponds to only a slightly smaller $p$-value.]

Probability statement: $p\text{-value} = P(F > 2.23) = 0.1241$

Compare $\alpha$ and the $p$-value: $\alpha = 0.01$, $p\text{-value} = 0.1241$

Make a decision: Since $\alpha < p\text{-value}$, you cannot reject $H_0$.

Conclusion: There is not sufficient evidence to conclude that there is a difference among the mean grades for the sororities.
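The $p$-value and the decision can be checked without a statistics package. The sketch below is one possible approach, not the method used in the text: it relies on the identity $P(F > f) = I_x(d_2/2,\, d_1/2)$ with $x = d_2/(d_2 + d_1 f)$, where $I_x$ is the regularized incomplete beta function, and evaluates the integral with Simpson's rule. The function name `f_upper_tail` and the `steps` parameter are hypothetical.

```python
import math

def f_upper_tail(f, d1, d2, steps=2000):
    """Approximate P(F > f) for an F-distribution with (d1, d2) degrees of freedom.

    Integrates the beta density t**(a-1) * (1-t)**(b-1) on [0, x] with
    Simpson's rule, where a = d2/2, b = d1/2 and x = d2 / (d2 + d1*f).
    Assumes f > 0, d2 >= 2, and an even number of steps.
    """
    a, b = d2 / 2, d1 / 2
    x = d2 / (d2 + d1 * f)
    h = x / steps

    def g(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = g(0.0) + g(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    integral = total * h / 3

    # Normalize by the beta function B(a, b), computed on the log scale.
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return integral / math.exp(log_beta)

alpha = 0.01
p_value = f_upper_tail(2.23, 3, 16)
print(round(p_value, 3))                                       # 0.124
print("reject H0" if p_value < alpha else "cannot reject H0")  # cannot reject H0
```

The same function gives approximately 0.05 for $F = 3.68$ and 0.00239 for $F = 9.27$ if one assumes 2 and 15 degrees of freedom; the figure caption does not state its degrees of freedom, so that pairing is an inference rather than something the text confirms.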


Except where noted, content and user contributions on this site are licensed under CC BY-SA 4.0 with attribution required.