# How F-tests work in Analysis of Variance (ANOVA)

The analysis of variance technique was first introduced by R. A. Fisher. Although the name ANOVA suggests splitting the total variance into components, it actually splits the total sum of squares of a response variable into separate sums of squares, one for each source of variation.
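For one-way classified data, this split can be written out as below. The notation is introduced here for illustration and is not from the original text: y_{ij} is the j-th observation in group i, n_i is the size of group i, k is the number of groups, \bar{y}_{i\cdot} is the mean of group i, and \bar{y}_{\cdot\cdot} is the grand mean.

```latex
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(y_{ij}-\bar{y}_{\cdot\cdot}\right)^{2}}_{\text{total SS}}
=
\underbrace{\sum_{i=1}^{k} n_i\left(\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot}\right)^{2}}_{\text{between-groups SS}}
+
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(y_{ij}-\bar{y}_{i\cdot}\right)^{2}}_{\text{within-groups SS}}
```

The first term on the right measures how far the group means lie from the grand mean, and the second measures the scatter of the observations around their own group mean.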

An F-statistic is the ratio of two independent chi-square variables, each divided by its degrees of freedom. In one-way ANOVA, we want to compare the means of several univariate, homoscedastic normal populations. As mentioned above, we split the total sum of squares in order to test H_{0}: all means are equal __vs__ H_{1}: at least one pair of means differs. The test statistic is an F-statistic: if the observed F-value is less than the tabulated F-value we fail to reject H_{0}, and if it is greater we reject H_{0}.
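To see where the tabulated F-value comes from, here is a minimal simulation sketch. It builds F-values as the ratio of two independent chi-square variables, each divided by its degrees of freedom, and checks how often they exceed the 5% critical value. The degrees of freedom (4 and 15), the seed and the number of draws are choices made for this illustration only.

```r
# Simulation sketch: ratio of two scaled chi-square variables behaves like F.
# df1, df2, the seed and n_sim are illustrative assumptions.
set.seed(1)
df1 <- 4
df2 <- 15
n_sim <- 100000

u <- rchisq(n_sim, df = df1)        # numerator chi-square draws
v <- rchisq(n_sim, df = df2)        # denominator chi-square draws (independent of u)
f_sim <- (u / df1) / (v / df2)      # each divided by its degrees of freedom

f_crit <- qf(0.95, df1, df2)        # tabulated 5% critical value of F(df1, df2)
tail_share <- mean(f_sim > f_crit)  # share of simulated values beyond the critical value

print(f_crit)
print(tail_share)
```

If the construction is right, `tail_share` should come out close to 0.05, which is exactly what "5% level of significance" means for this test.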

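In one-way ANOVA with k groups and N observations in total, the F-statistic is the ratio of the between-groups mean square to the within-groups mean square: F = MSB / MSW, where MSB = SSB / (k - 1) and MSW = SSW / (N - k), and SSB and SSW are the between-groups and within-groups sums of squares.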

Let us take an example to understand what this means. Here we are given one-way classified data from the synthetic veneer experiment, with four measurements for each of five brands.

Now we can calculate the between-groups variance and the within-groups variance; taking their ratio gives the observed value of F, which we then compare with the tabulated F-value.
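The calculation just described can be sketched by hand in R. The data values are the ones used in the R script of this post; the variable names (`ssb`, `ssw`, `msb`, `msw`, `f_obs`) are chosen here for illustration.

```r
# Hand calculation of the one-way ANOVA F-statistic (illustrative sketch).
value <- c(2.3, 2.1, 2.4, 2.5, 2.2, 2.0, 1.9, 2.1, 2.2, 2.3,
           2.4, 2.6, 2.4, 2.7, 2.6, 2.7, 2.3, 2.5, 2.3, 2.4)
brand <- rep(c("ACME", "AJAX", "CHAMP", "TUFFY", "XTRA"), each = 4)

k <- length(unique(brand))                  # number of groups
N <- length(value)                          # total number of observations
grand_mean  <- mean(value)
group_means <- tapply(value, brand, mean)   # one mean per brand, named by brand
group_sizes <- tapply(value, brand, length)

# Between-groups and within-groups sums of squares
ssb <- sum(group_sizes * (group_means - grand_mean)^2)
ssw <- sum((value - group_means[brand])^2)  # each value minus its own brand mean

# Mean squares and the observed F
msb <- ssb / (k - 1)
msw <- ssw / (N - k)
f_obs <- msb / msw

print(c(SSB = ssb, SSW = ssw, MSB = msb, MSW = msw, F = f_obs))
```

The two sums of squares add up to the total sum of squares, `sum((value - grand_mean)^2)`, which is a quick way to check the arithmetic.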

Let me show you the ANOVA table using MS Excel.

Anova: Single Factor

In the above table, F is the observed value of the F-statistic and F crit is the tabulated value of the F-statistic. Here F > F crit, so we reject the null hypothesis at the 5% level of significance.

Now, let me show you how we can perform ANOVA in the R programming environment.

As the data set is small, we will enter it manually; for larger data sets, R offers many functions for reading data from files. Here is the R script.

```r
# Response values, four per brand, in brand order
value <- c(2.3,2.1,2.4,2.5,2.2,2.0,1.9,2.1,2.2,2.3,2.4,2.6,2.4,2.7,2.6,2.7,2.3,2.5,2.3,2.4)
brand <- rep(c("ACME","AJAX","CHAMP","TUFFY","XTRA"),each=4)
data <- data.frame(brand,value)

summary(aov(value ~ brand , data = data))  # one-way ANOVA table
qf(0.95,4,15)                              # 5% critical value of F with 4 and 15 df
```

The output is as follows:

```
> summary(aov(value ~ brand , data = data))
            Df Sum Sq Mean Sq F value  Pr(>F)
brand        4 0.6170 0.15425   7.404 0.00168 **
Residuals   15 0.3125 0.02083
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> qf(0.95,4,15)
[1] 3.055568
```

So, from the above we can see that the observed value of the F-statistic is 7.404 and the value of F crit is 3.055568. Since 7.404 > 3.055568, we reject the null hypothesis at the 5% level of significance.
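Instead of reading the numbers off the printed table, the same decision can be made programmatically. The sketch below pulls the observed F and its p-value out of the `aov` summary and compares them with the critical value; the object names (`fit`, `tab`, `reject`, `p_check`) are illustrative.

```r
# Sketch: extract F and the p-value from the ANOVA table and make the decision.
value <- c(2.3, 2.1, 2.4, 2.5, 2.2, 2.0, 1.9, 2.1, 2.2, 2.3,
           2.4, 2.6, 2.4, 2.7, 2.6, 2.7, 2.3, 2.5, 2.3, 2.4)
brand <- rep(c("ACME", "AJAX", "CHAMP", "TUFFY", "XTRA"), each = 4)
data <- data.frame(brand, value)

fit <- aov(value ~ brand, data = data)
tab <- summary(fit)[[1]]            # the ANOVA table as a data frame

f_obs <- tab[["F value"]][1]        # observed F for brand
p_val <- tab[["Pr(>F)"]][1]         # p-value reported in the table
df1 <- tab[["Df"]][1]               # numerator degrees of freedom
df2 <- tab[["Df"]][2]               # denominator (residual) degrees of freedom

f_crit <- qf(0.95, df1, df2)        # 5% critical value
reject <- f_obs > f_crit            # TRUE means reject the null hypothesis

# The p-value is the upper-tail probability of the observed F
p_check <- pf(f_obs, df1, df2, lower.tail = FALSE)

print(c(F = f_obs, F_crit = f_crit, p_value = p_val))
print(reject)
```

Comparing F with F crit and comparing the p-value with 0.05 are two views of the same rule: F exceeds the 5% critical value exactly when the p-value is below 0.05.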

So, this is how the F-statistic is used in the ANOVA technique.