Application Of F Distribution
“There are three types of lies — lies, damn lies, and statistics.” ― Benjamin Disraeli
In estimation theory, we draw samples from a population and estimate the population parameters from the sample values. In testing, we check whether these estimates have the required precision. This article discusses applications of the F distribution.
We often need to compare two or more samples that may have been drawn from the same population or from different populations, with the same or different variability. The F distribution is the tool for such comparisons.
- The F distribution is used to test the equality of two population variances. If a researcher wants to test whether two independent samples have been drawn from normal populations with the same variability, the F-test is generally employed. ANOVA is the best-known use of the F-test: it compares a variance ratio, F = variation between sample means / variation within the samples.
For example, suppose two sets of pumpkins are grown under two different experimental conditions, and the researcher selects random samples of sizes 9 and 11. The standard deviations of their weights are 0.6 and 0.8 respectively. Assuming that the weights are normally distributed, the researcher conducts an F-test of the hypothesis that the true variances are equal.
The solution is as follows.
We want to test H0: σx² = σy²
against H1: σx² ≠ σy²
Here n1 = 11, n2 = 9, sx = 0.8, sy = 0.6
Under H0, F = Sx²/Sy² follows F(n1 - 1, n2 - 1), where Sx² and Sy² are the unbiased estimates of the population variances and the larger estimate is placed in the numerator.
Sx² = [n1/(n1 - 1)] sx² = (11/10)(0.64) = 0.704
Sy² = [n2/(n2 - 1)] sy² = (9/8)(0.36) = 0.405
Fcal = 0.704/0.405 = 1.74
Ftab (0.05) = 3.35 at (10, 8) d.f.
So Fcal < Ftab at the 5% level of significance.
H0 is not rejected, so the data give no evidence that the true variances differ. The two samples of pumpkins may be regarded as coming from populations with the same variability.
The same calculation in R:
> Sxs = 0.704
> Sys = 0.405
> fcal = Sxs/Sys
> fcal
[1] 1.738272
> qf(0.95, df1 = 10, df2 = 8)
[1] 3.347163
Clearly fcal < qf, so H0 is not rejected, i.e. there is no evidence at the 5% level that the populations differ in variability.
A general program using the F-test functions in R:
> x = c(2.3, 1, 4.5, 4, 7, 8, 6.8, 2, 9, 3.7)
> y = c(1.2, 4.5, 5, 6.7, 8.2, 5, 3, 2.2, 0.6, 1, 2)
> var.test(x, y, alternative = "two.sided")

        F test to compare two variances

data:  x and y
F = 1.206, num df = 9, denom df = 10, p-value = 0.7699
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.319129 4.780328
sample estimates:
ratio of variances
          1.205976
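The statistic reported by var.test can be reproduced by hand, which makes the link to the F distribution explicit. The following is a minimal sketch in base R using the same x and y as above. The variable names (f_stat, p_two_sided and so on) are illustrative choices and not part of any API.

```r
# Same data as in the var.test example
x <- c(2.3, 1, 4.5, 4, 7, 8, 6.8, 2, 9, 3.7)
y <- c(1.2, 4.5, 5, 6.7, 8.2, 5, 3, 2.2, 0.6, 1, 2)

# var() uses the (n - 1) divisor, i.e. the unbiased sample variance
f_stat <- var(x) / var(y)
df1 <- length(x) - 1   # numerator degrees of freedom
df2 <- length(y) - 1   # denominator degrees of freedom

# Two-sided p-value: twice the smaller tail area of F(df1, df2)
p_lower <- pf(f_stat, df1, df2)
p_two_sided <- 2 * min(p_lower, 1 - p_lower)

# Compare with the built-in test; both comparisons should give TRUE
res <- var.test(x, y, alternative = "two.sided")
all.equal(unname(res$statistic), f_stat)
all.equal(res$p.value, p_two_sided)
```

Because var() already divides by n - 1, no correction factor such as n/(n - 1) is needed when working from raw data. That factor only appears when the given standard deviations were computed with divisor n.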
- The F-test also has an application in regression analysis. For a multiple regression model with intercept, we want to test the following null and alternative hypotheses:
H0: β1 = β2 = … = βp-1 = 0
H1: βj ≠ 0 for at least one value of j
This test is known as the overall F-test for regression.
Example: For a multiple regression model with 35 observations and 9 independent variables (10 parameters), SSE (error) = 134 and SSM (model) = 289. Test at the 0.05 level the null hypothesis that all of the regression coefficients β1, …, β9 are zero.
Solution: DFE (error) = n - p = 35 - 10 = 25 and DFM (model) = p - 1 = 10 - 1 = 9, where DF denotes degrees of freedom.
State the null and alternative hypotheses:
H0: β1 = β2 = … = β9 = 0
H1: βj ≠ 0 for some j
Compute the test statistic:
Fcal = MSM/MSE = (SSM/DFM) / (SSE/DFE)
= (289/9) / (134/25)
= 32.111 / 5.360
= 5.991
Ftab = 2.28 at the 5% level of significance and (9, 25) d.f.
Clearly Fcal > Ftab, so H0 is rejected, i.e. at least one of the regression coefficients is nonzero, or equivalently the study variable depends on at least one of the independent variables.
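The arithmetic of the overall F-test can be checked in R in the same way as the pumpkin example. This is a minimal sketch working only from the summary figures quoted in the example. The variable names are illustrative, and the p-value line is an addition that the worked solution does not compute.

```r
# Summary figures from the worked example
n   <- 35    # observations
p   <- 10    # parameters, including the intercept
ssm <- 289   # model sum of squares
sse <- 134   # error sum of squares

dfm <- p - 1   # model degrees of freedom
dfe <- n - p   # error degrees of freedom

f_cal <- (ssm / dfm) / (sse / dfe)        # MSM / MSE
f_tab <- qf(0.95, df1 = dfm, df2 = dfe)   # upper 5% point of F(9, 25)

# Upper-tail probability of the observed statistic under H0
p_val <- pf(f_cal, dfm, dfe, lower.tail = FALSE)

f_cal > f_tab   # TRUE means H0 is rejected at the 5% level
```

When the raw data are available, fitting the model with lm() and calling summary() on the result reports the same overall F statistic and its degrees of freedom.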
- Another application of the F-test is testing the significance of an observed multiple correlation coefficient R² of a variate with k other variates, in a random sample of size n from a (k+1)-variate population.
Here H0: ρ² = 0, where ρ² is the population multiple correlation coefficient,
against H1: ρ² ≠ 0
Under H0, F = (R²/k) / ((1 - R²)/(n - k - 1)) follows F(k, n - k - 1).
We have thus seen how the F distribution is useful in settings such as ANOVA and regression models. The F-test is a parametric test: it helps the researcher draw inferences about the population from which the data were drawn, and it is called parametric because it assumes a particular form for the population distribution (here, normality) and tests hypotheses about its parameters.
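The test of an observed R² can also be sketched in R. The helper r2_f_test below is a hypothetical function written for this illustration, not part of base R. As a check, it is applied to the R² implied by the regression example, assuming the total sum of squares equals SSM + SSE, as it does for a model with intercept.

```r
# F test of H0: rho^2 = 0 from an observed R^2, k predictors, sample size n.
# Uses F = (R^2 / k) / ((1 - R^2) / (n - k - 1)) on (k, n - k - 1) d.f.
r2_f_test <- function(r2, n, k, alpha = 0.05) {
  df1 <- k
  df2 <- n - k - 1
  f_cal <- (r2 / df1) / ((1 - r2) / df2)
  list(
    f_cal   = f_cal,
    f_tab   = qf(1 - alpha, df1, df2),
    p_value = pf(f_cal, df1, df2, lower.tail = FALSE)
  )
}

# R^2 implied by the regression example: SSM / (SSM + SSE)
r2 <- 289 / (289 + 134)
res <- r2_f_test(r2, n = 35, k = 9)
res$f_cal   # equals MSM / MSE from the overall F-test
```

Since R²/(1 - R²) = SSM/SSE, the statistic is algebraically identical to the overall F-test for regression, so the two applications lead to the same decision on the same data.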