Implementation of Statistical Hypothesis Testing in R

What is the statistical hypothesis?
A statistical hypothesis is an assertion or conjecture about the distribution of one or more random variables. If the hypothesis completely specifies the distribution, it is called a simple hypothesis; otherwise, it is called a composite hypothesis.

What is testing?
Testing is a procedure or rule for deciding whether or not to reject the hypothesis. We will discuss some tests (z-test, t-test, chi-squared test, F-test) and their implementation in R.

Z-test:

Suppose we have n random samples X1, X2, …, Xn from a normal distribution with mean μ and variance σ² (which is specified/known). Here our hypothesis is about the mean μ of the normal population. We can have three types of tests:

(i) H0: μ = μ0 vs H1: μ ≠ μ0

(ii) H0: μ = μ0 vs H1: μ > μ0

(iii) H0: μ = μ0 vs H1: μ < μ0

(i) H0: μ = μ0 vs H1: μ ≠ μ0

In this case, we have a two-tailed test. The test statistic we use here is

Z = (X̄ − μ0) / (σ/√n),

which follows N(0,1) under the null hypothesis. We reject the null hypothesis at level α if the absolute value of the observed Z statistic is greater than Z1−α/2, the upper α/2 point of N(0,1).

Example:

Suppose we have 10 random samples from a normal population having variance 25.

3.27, 2.53, 2.98, 4.11, 3.35, 3.35, 0.38, 4.93, 3.97, 3.17

We want to test whether the mean is 5 or not. Here our hypothesis is H0: μ = 5 vs H1: μ ≠ 5. The R code to perform the test is given below.

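Base R has no built-in z-test (add-on packages such as BSDA provide a z.test() function), so a minimal sketch is to compute the statistic directly:

CODE:

x <- c(3.27, 2.53, 2.98, 4.11, 3.35, 3.35, 0.38, 4.93, 3.97, 3.17)
mu0 <- 5                          # hypothesised mean
sigma <- 5                        # known population sd (variance 25)
z <- (mean(x) - mu0) / (sigma / sqrt(length(x)))
p.value <- 2 * pnorm(-abs(z))     # two-sided p-value
z                                 # about -1.14, inside (-1.96, 1.96)
p.value                           # about 0.26, greater than 0.05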

So, we accept (can’t reject) the statement that the mean of the population is 5 at the 5% level of significance.

(ii) H0: μ = μ0 vs H1: μ > μ0

[Note: The above hypothesis is equivalent to testing H0: μ ≤ μ0 vs H1: μ > μ0.] This is a right-tailed test where the test statistic is the same as above,

Z = (X̄ − μ0) / (σ/√n),

which follows N(0,1) under the null hypothesis. We reject the null hypothesis at level α if the value of the observed Z statistic is greater than Z1−α, the upper α point of N(0,1).

Example:

Consider the same example as before: we have 10 random samples from a normal population having variance 25.

3.27, 2.53, 2.98, 4.11, 3.35, 3.35, 0.38, 4.93, 3.97, 3.17

Our objective is to test whether the mean is greater than 5. Here our hypothesis is H0: μ = 5 vs H1: μ > 5.

The R code to perform the test is given below.

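A minimal sketch in base R, with the same statistic but a right-tailed p-value:

CODE:

x <- c(3.27, 2.53, 2.98, 4.11, 3.35, 3.35, 0.38, 4.93, 3.97, 3.17)
z <- (mean(x) - 5) / (5 / sqrt(length(x)))
p.value <- pnorm(z, lower.tail = FALSE)  # right-tailed p-value
z                                        # about -1.14, well below Z(0.95) = 1.645
p.value                                  # about 0.87, greater than 0.05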

So, we cannot conclude that the mean of the population is greater than 5 at the 5% level of significance.

(iii) H0: μ = μ0 vs H1: μ < μ0

[Note: The above hypothesis is equivalent to testing H0: μ ≥ μ0 vs H1: μ < μ0.] This is a left-tailed test where the test statistic is the same as above, i.e.

Z = (X̄ − μ0) / (σ/√n),

which follows N(0,1) under the null hypothesis. We reject the null hypothesis at level α if the value of the observed Z statistic is smaller than Zα, the lower α point of N(0,1).

Example:

Consider the same example. We have 10 random samples from a normal population having variance 25.

3.27, 2.53, 2.98, 4.11, 3.35, 3.35, 0.38, 4.93, 3.97, 3.17

We want to test whether the mean is less than 5. Here our hypothesis is H0: μ = 5 vs H1: μ < 5. The R code to perform the test is given below.

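The same sketch with a left-tailed p-value:

CODE:

x <- c(3.27, 2.53, 2.98, 4.11, 3.35, 3.35, 0.38, 4.93, 3.97, 3.17)
z <- (mean(x) - 5) / (5 / sqrt(length(x)))
p.value <- pnorm(z)   # left-tailed p-value
z                     # about -1.14, above Z(0.05) = -1.645
p.value               # about 0.13, greater than 0.05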

So, we cannot say the population mean is less than 5 at the 5% level of significance.

These are the one-sample z-tests. We can have situations where we have to compare two normal populations with known variances. Let the two populations be N(μ1, σ1²) and N(μ2, σ2²), and let the random samples from these two populations be X1, X2, …, Xn1 and Y1, Y2, …, Yn2. Here σ1² and σ2² are known quantities.

Our interest is to test, H0: μ1 – μ2 = μ0 vs H1: μ1 – μ2 ≠ μ0, where μ0 is a real constant.

This is equivalent to test, H0: μ1 – μ2 – μ0 = 0 vs H1: μ1 – μ2 – μ0 ≠ 0

Here the test statistic is

Z = (X̄ − Ȳ − μ0) / √(σ1²/n1 + σ2²/n2),

which follows N(0,1) under the null hypothesis. We reject the null hypothesis at level α if the absolute value of the statistic is greater than Z1−α/2, the upper α/2 point of N(0,1).

Example:

Suppose we have 8 random samples from N(μ1,25) and 10 random samples from N(μ2,9) as follows,

X: 14.63, 5.44, 6.08, 7.40, 15.41, -0.82, 7.89, 1.04
Y: 8.26, 7.79, 7.98, 8.91, 14.87, 6.04, 7.82, 4.52, 12.71, 10.51

We want to test whether the means of two populations are equal or not.

Here our hypothesis will be, H0: μ1 – μ2 = 0 vs H1: μ1 – μ2 ≠ 0

The R code to perform the test is given below.

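A minimal sketch in base R, plugging the known variances 25 and 9 into the two-sample statistic:

CODE:

x <- c(14.63, 5.44, 6.08, 7.40, 15.41, -0.82, 7.89, 1.04)
y <- c(8.26, 7.79, 7.98, 8.91, 14.87, 6.04, 7.82, 4.52, 12.71, 10.51)
z <- (mean(x) - mean(y)) / sqrt(25 / length(x) + 9 / length(y))
p.value <- 2 * pnorm(-abs(z))   # two-sided p-value
z                               # about -0.90
p.value                         # about 0.37, greater than 0.05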

So, here the means do not differ significantly at the 5% level of significance.

Similarly, we can test H0: μ1 – μ2 = 0 vs H1: μ1 – μ2 > 0, or H0: μ1 – μ2 = 0 vs H1: μ1 – μ2 < 0.

Here the statistic remains the same but the rejection region will be changed as discussed in one-sample z-test.

t-test:

We have discussed tests for the mean of a normal population with known variance. But in most real-life situations, the variance is unknown. In this case, we have the t-test for testing the mean of a normal population with unknown variance. Basically, we just plug in the estimate of the variance and carry out the test.

One sample t-test

The test statistic we use is

t = (X̄ − μ0) / (s/√n),

which follows a t-distribution with n−1 degrees of freedom under the null hypothesis. Here s is the sample standard deviation, where s² is the sample variance with divisor n−1. Let us discuss this with an example. Suppose we have a sample of 30 observations of IQ scores from a class.

114, 104, 89, 118, 105, 90, 113, 90, 108, 116, 116, 106, 92, 105, 94, 100, 95, 97, 89, 97, 90, 124, 100, 98, 76, 106, 113, 86, 75, 102

Assuming a normal distribution, one may want to test whether the mean IQ score is 95. Using R, we can easily perform the hypothesis test H0: μ = 95 vs H1: μ ≠ 95.

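Here we can use the built-in t.test() function directly:

CODE:

iq <- c(114, 104, 89, 118, 105, 90, 113, 90, 108, 116,
        116, 106, 92, 105, 94, 100, 95, 97, 89, 97,
        90, 124, 100, 98, 76, 106, 113, 86, 75, 102)
t.test(iq, mu = 95)   # two-sided test of H0: mu = 95
# t is about 2.39 on 29 df, with a p-value of about 0.02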

Here the p-value is less than 0.05, so we reject the null hypothesis at the 5% level of significance: on average, the IQ score is not 95. Similarly, we can have the right/left-tailed test by setting the “alternative” to “greater”/“less”.

Two sample t-test

Now suppose we have two independent groups and we want to compare their means. Here we will use the two-sample t-test. The test statistic we use here is

t = (X̄1 − X̄2) / √(s1²/n1 + s2²/n2),

where X̄1, X̄2 are the sample means and s1², s2² are the sample variances of the two groups. With equal variances pooled, the statistic follows the t distribution with n1+n2−2 degrees of freedom under the null hypothesis; with unequal variances, the degrees of freedom are approximated (Welch’s test). Suppose we have scores for some males and females as follows,

Female: 95, 78, 68, 95, 98, 79, 98, 86, 78, 89, 89, 94

Male: 100, 100, 95, 90, 95, 98, 100, 100

Here someone may test whether the means are the same or not, so it is a two-tailed t-test. Here the hypothesis is H0: μ1 – μ2 = 0 vs H1: μ1 – μ2 ≠ 0, with unequal variances. We can perform the test using R and get the result as follows.


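With var.equal = FALSE (the default), t.test() performs Welch’s unequal-variances test:

CODE:

female <- c(95, 78, 68, 95, 98, 79, 98, 86, 78, 89, 89, 94)
male <- c(100, 100, 95, 90, 95, 98, 100, 100)
t.test(female, male, alternative = "two.sided", mu = 0, var.equal = FALSE)
# t is about -3.3, with a p-value well below 0.05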

Here the p-value is less than 0.05, so we reject the null hypothesis at the 5% level of significance. We can conclude that the means differ significantly.

Similarly, we can test that the difference is c (some constant); we then replace the value of “mu” with c. We can have a left/right-tailed test by setting the “alternative” to “less”/“greater”. If the two groups have a common variance, then we set the logical argument “var.equal” to TRUE. [NOTE: When the variances are equal, s1² and s2² are replaced by s², the pooled sample variance with divisor n1+n2−2.]

Paired t-test

Here we also have two variables, but they are related (i.e. collected from the same group, person, item or thing). We take the differences between the pairs and, treating the differences as a single variable, perform a one-sample t-test.

Writing di = Xi − Yi for the paired differences, the hypothesis we test here is H0: μd = 0 vs H1: μd ≠ 0, using the test statistic

t = d̄ / (sd/√n),

where d̄ is the mean and sd the standard deviation (with divisor n−1) of the differences. This follows the t distribution with n−1 degrees of freedom under the null hypothesis. Let us consider an example of marks in mathematics and statistics for 10 students.

Mathematics: 97, 86, 100, 73, 79, 93, 80, 96, 82, 65

Statistics: 95, 77, 72, 74, 85, 80, 92, 78, 79, 75

We want to test whether there is any significant difference between these marks. Using R:

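With paired = TRUE, t.test() takes the differences internally and performs the one-sample test on them:

CODE:

math <- c(97, 86, 100, 73, 79, 93, 80, 96, 82, 65)
statistics <- c(95, 77, 72, 74, 85, 80, 92, 78, 79, 75)
t.test(math, statistics, paired = TRUE)
# t is about 1.09 on 9 df, with a p-value of about 0.30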

Here the p-value is greater than 0.05, so we accept (can’t reject) the null hypothesis at the 5% level of significance. We can conclude that there is no significant difference between the marks in mathematics and statistics. Similarly, we can have the left/right-tailed test by setting the “alternative” to “less”/“greater”, and we can test that the difference is c (some constant) by replacing the value of “mu” with c.

Chi-squared test:

First, let me tell you the basics of the chi-squared variable. If X1, X2, …, Xn follow N(0,1) independently, then each Xi² follows the χ² distribution with 1 degree of freedom. The chi-squared distribution has the additive property, so ΣXi² follows the χ² distribution with n degrees of freedom. Whenever the null distribution of a test statistic is a χ² distribution, the test is called a chi-squared test. This is the basic idea of the chi-squared (χ²) test.

Now, suppose we have n observations from N(μ0, σ²), where μ0 is a known constant, and we want to test whether the variance equals a specified value σ0². Here it will be a chi-squared test, as the test statistic

χ² = Σ(Xi − μ0)² / σ0²

follows the χ² distribution with n degrees of freedom under the null hypothesis. If the mean is unknown, then the test statistic becomes

χ² = (n−1)S² / σ0²,

where S² is the sample variance with divisor n−1. This statistic follows the χ² distribution with n−1 degrees of freedom under the null hypothesis.

The main use of the chi-squared test is to check independence between two categorical variables.

Suppose we have a 2 x 2 contingency table showing the effects of a new drug:

              Improved   Not improved
Not treated      26           29
Treated          35           15

We want to check whether the treatment really leads to improvement. Here our null hypothesis is “treatment and effect are independent” and the alternative is “treatment and effect are not independent”.

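The built-in chisq.test() function performs the test (for a 2 x 2 table it applies Yates’ continuity correction by default):

CODE:

drug <- matrix(c(26, 29, 35, 15), nrow = 2, byrow = TRUE,
               dimnames = list(c("Not treated", "Treated"),
                               c("Improved", "Not improved")))
chisq.test(drug)
# X-squared is about 4.7 on 1 df, with a p-value of about 0.03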

Here the p-value is less than 0.05, so we reject the null hypothesis at the 5% level of significance. Thus, we conclude that there is a significant relationship between the treatment and the effect of the drug.

F-test:

The F-test is used to test the equality of two variances of normal populations. We have the function “var.test()” for comparing two variances. Suppose we have the observations from two groups,
A: 9.83, 9.50, 5.49, 10.45, 7.76, 15.11, -5.30, -2.50, -1.29, 15.11
B: 29.82, -2.65, 14.78, 1.09, 9.86, 8.30, 8.93, 12.04, 31.89, 2.48, 8.38, -2.59

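var.test() takes the two samples directly and tests H0: the ratio of the variances is 1:

CODE:

A <- c(9.83, 9.50, 5.49, 10.45, 7.76, 15.11, -5.30, -2.50, -1.29, 15.11)
B <- c(29.82, -2.65, 14.78, 1.09, 9.86, 8.30, 8.93, 12.04, 31.89, 2.48, 8.38, -2.59)
var.test(A, B)
# F = var(A)/var(B) is about 0.42 on (9, 11) df, with a p-value above 0.05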

Here the p-value of the test is greater than 0.05, so we accept (can’t reject) the null hypothesis at the 5% level of significance. Thus, we can say the variances do not differ significantly.

The F-test is also used in the ANOVA technique. In ANOVA, we test the equality of the effects of several treatments. For k treatments and n observations in total, the test statistic becomes

F = [SS(treatment)/(k−1)] / [SS(error)/(n−k)],

which follows the F distribution with (k−1, n−k) degrees of freedom under the null hypothesis. R has a function named “aov()” whose “summary()” will provide the testing details with the p-value. We can decide to reject/accept the null hypothesis depending on the p-value.
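As a minimal sketch of how “aov()” is used (the response y and grouping factor g below are purely hypothetical, made-up data):

CODE:

set.seed(1)                                       # reproducible hypothetical data
y <- c(rnorm(5, 10), rnorm(5, 12), rnorm(5, 11))  # responses for three groups
g <- gl(3, 5, labels = c("T1", "T2", "T3"))       # treatment labels
summary(aov(y ~ g))                               # ANOVA table with the F statistic and p-value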

Similarly, for testing the significance of a regression, we perform an F-test just as in ANOVA.

These are how all the above tests work, and it is quite easy to perform them in R as specified in this article. All the tests share the assumption that the variable follows a normal distribution. In real life, the samples we get are often large enough that, by the central limit theorem, these tests remain approximately valid even when exact normality does not hold.
