# Application Of F Distribution

“There are three types of lies — lies, damn lies, and statistics.”  ― Benjamin Disraeli

In estimation theory, we draw samples from a population and estimate the population parameters from the sample values. In hypothesis testing, we check whether these estimates are of the required precision. This article discusses applications of the F distribution.

We often need to compare two or more samples drawn from the same or different populations, which may have the same or different variability. The F distribution is the tool for such comparisons.

1. The F distribution is used to test the equality of two population variances. If a researcher wants to test whether two independent samples have been drawn from normal populations with the same variability, they generally employ the F-test. ANOVA is the best-known example of the F-test as a variance-ratio test, in which F = variation between sample means / variation within the samples.

As an example, suppose two sets of pumpkins are grown under two different experimental conditions. The researcher selects random samples of sizes 9 and 11, whose weights have standard deviations 0.6 and 0.8 respectively. Assuming the weights are normally distributed, the researcher conducts an F-test of the hypothesis that the true variances are equal.

Now the solution for this is:

We want to test $H_0: \sigma_x^2 = \sigma_y^2$

against $H_1: \sigma_x^2 \neq \sigma_y^2$

Here $n_1 = 11$, $n_2 = 9$, $s_x = 0.8$, $s_y = 0.6$.

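Under $H_0$, the statistic $F = S_x^2 / S_y^2$ follows the F distribution with $(n_1 - 1,\ n_2 - 1)$ degrees of freedom, where $S_x^2$ and $S_y^2$ are the unbiased estimates of the population variances:

$S_x^2 = \dfrac{n_1}{n_1 - 1}\, s_x^2 = \dfrac{11}{10}(0.8)^2 = 0.704$

$S_y^2 = \dfrac{n_2}{n_2 - 1}\, s_y^2 = \dfrac{9}{8}(0.6)^2 = 0.405$

$F_{cal} = 0.704 / 0.405 \approx 1.74$

$F_{tab}(0.05) = 3.35$ at $(10, 8)$ d.f.

So $F_{cal} < F_{tab}$ at the 5% level of significance.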

$H_0$ is therefore not rejected: at the 5% level of significance there is no evidence that the true variances differ, so the two samples of pumpkins may be regarded as coming from populations with the same variability.

R program
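A minimal sketch of such a program, using the pumpkin figures quoted in the example. The variable names `fcal` and `ftab` are illustrative, and it is assumed that the quoted standard deviations were computed with divisor n, so they are rescaled by n/(n - 1) to give unbiased variance estimates.

```r
# Sample sizes and standard deviations from the pumpkin example
n1 <- 11; sx <- 0.8   # first sample
n2 <- 9;  sy <- 0.6   # second sample

# Unbiased variance estimates (assumption: sx and sy use divisor n)
Sx2 <- n1 / (n1 - 1) * sx^2
Sy2 <- n2 / (n2 - 1) * sy^2

# Observed variance ratio, larger estimate in the numerator
fcal <- Sx2 / Sy2

# Upper 5% point of the F distribution with (n1 - 1, n2 - 1) d.f.
ftab <- qf(0.95, df1 = n1 - 1, df2 = n2 - 1)

cat("fcal =", round(fcal, 3), " critical value =", round(ftab, 3), "\n")
if (fcal < ftab) {
  cat("H0 is not rejected at the 5% level\n")
} else {
  cat("H0 is rejected at the 5% level\n")
}
```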

Clearly, `fcal` is smaller than the critical value returned by `qf`, so H0 is not rejected, i.e. the data are consistent with the samples having come from populations with the same variability.

A general program using the F-test functions in R:
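A minimal sketch using `var.test` from R's built-in stats package, which runs the F-test of equal variances on raw observations. The two vectors below are simulated and purely hypothetical (they are not the pumpkin data, for which only summary statistics are given); the seed, means and standard deviations are arbitrary assumptions.

```r
set.seed(1)  # arbitrary seed so the simulated data are reproducible

# Hypothetical raw weights for two groups (simulated, not real data)
x <- rnorm(11, mean = 5, sd = 0.8)
y <- rnorm(9,  mean = 5, sd = 0.6)

# F-test of H0: var(x) / var(y) = 1 against a two-sided alternative
res <- var.test(x, y, ratio = 1, alternative = "two.sided", conf.level = 0.95)
print(res)

# Components of the result
res$statistic   # observed F = var(x) / var(y)
res$parameter   # numerator and denominator degrees of freedom
res$p.value     # two-sided p-value

# Decision at the 5% level
if (res$p.value < 0.05) {
  cat("Reject H0: the variances differ\n")
} else {
  cat("Do not reject H0: no evidence that the variances differ\n")
}
```

`var.test` needs the raw observations; when only summary statistics are available, the ratio and critical value have to be computed by hand with `qf` or `pf`.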

2. The F-test also has applications in regression analysis. For a multiple regression model with an intercept, we want to test the following null and alternative hypotheses:

$H_0: \beta_1 = \beta_2 = \dots = \beta_{p-1} = 0$

$H_1: \beta_j \neq 0$ for at least one value of $j$

This test is known as the overall F-test for regression.

Example: for a multiple regression model with 35 observations and 9 independent variables (10 parameters), SSE (error sum of squares) = 134 and SSM (model sum of squares) = 289. Test, at the 0.05 level, the null hypothesis that all of the regression coefficients other than the intercept are zero.

Solution: DFE (error) = n − p = 35 − 10 = 25 and DFM (model) = p − 1 = 10 − 1 = 9, where DF denotes degrees of freedom. The null and alternative hypotheses are $H_0$ and $H_1$ as stated above. Compute the test statistic:
$F_{cal} = \dfrac{\mathrm{MSM}}{\mathrm{MSE}} = \dfrac{\mathrm{SSM}/\mathrm{DFM}}{\mathrm{SSE}/\mathrm{DFE}} = \dfrac{289/9}{134/25} = \dfrac{32.111}{5.360} = 5.991$

$F_{tab} = 2.28$ at the 5% level of significance with $(9, 25)$ d.f.

Clearly $F_{cal} > F_{tab}$, so $H_0$ is rejected: at least one of the regression coefficients is non-zero, i.e. the study variable depends on at least one of the independent variables.
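The arithmetic above can be checked with a short R sketch. The variable names are illustrative, and the p-value from `pf` is an addition that does not appear in the worked solution.

```r
# Quantities given in the regression example
n   <- 35    # observations
p   <- 10    # parameters (9 slopes + intercept)
SSM <- 289   # model sum of squares
SSE <- 134   # error sum of squares

DFM <- p - 1   # model degrees of freedom
DFE <- n - p   # error degrees of freedom

# Overall F statistic: ratio of the model and error mean squares
fcal <- (SSM / DFM) / (SSE / DFE)

# Upper 5% critical value and the corresponding p-value
ftab <- qf(0.95, df1 = DFM, df2 = DFE)
pval <- pf(fcal, df1 = DFM, df2 = DFE, lower.tail = FALSE)

cat("fcal =", round(fcal, 3), " ftab =", round(ftab, 2),
    " p-value =", signif(pval, 3), "\n")
```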

3. Another application of the F-test is testing the significance of an observed multiple correlation coefficient $R$ of a variate with $k$ other variates, in a random sample of size $n$ from a $(k+1)$-variate population. Here we test $H_0: \rho^2 = 0$ against $H_1: \rho^2 \neq 0$, where $\rho$ is the population multiple correlation coefficient. Under $H_0$ (and assuming multivariate normality), the statistic $F = \dfrac{R^2 / k}{(1 - R^2)/(n - k - 1)}$ follows the F distribution with $(k,\ n - k - 1)$ degrees of freedom.

We have thus seen how the F distribution is useful in areas such as ANOVA and regression modelling. By nature, the F-test is a parametric test: it helps the researcher draw inferences about the population from which the data were drawn, and it is called parametric because it assumes a particular form for the population distribution (normality) and tests hypotheses about its parameters.
