# Non-Parametric Tests and Their Applications Using R

Before we start with non-parametric tests, recall the parametric framework: there we assume an explicit functional form for the population distribution, labelled by a parameter θ that is unknown or not completely known, and any specific feature of the population distribution can then be expressed as a function of θ.

Within a non-parametric framework, on the other hand, no assumption is made about the functional form of the population distribution. Only very general assumptions are made, such as that the population distribution is absolutely continuous. Non-parametric inferential problems still involve parameters, but those parameters do not label the population distribution.

Suppose we are given a sample to validate a conjecture about the population median and we have no idea about the population distribution function (or the structural form of the distribution is too complicated to deal with). Here the median does not label the distribution function of the population. In this case, we use **non-parametric tests**.

There are many situations where a non-parametric test is a good choice.

- If the sample size is too small to validate the assumptions (e.g. normality) of parametric tests, non-parametric tests are the better option.
- If the data are given in the form of ranks, scores, grades etc., we can adopt non-parametric tests. Usually, non-parametric tests are based on fractiles, ranks, concordance, discordance etc.

Parametric tests cannot be applied if the data are ordinal or nominal, but non-parametric tests make no such assumptions.

An important feature of a non-parametric test is that the test statistic is distribution-free, i.e. the distribution of the test statistic does not depend on the population distribution. If both a parametric and a non-parametric test exist for a problem, the former is likely to be more powerful than the latter.

Let me discuss some important non-parametric tests and their implementation in R.

**Sign Test:**

__One-sample__:

The one-sample sign test is used to check whether the population median is equal to some constant. Suppose we have data on the reviews (1–10) of a film, and we want to check whether the median is 7 or not. We have the data as,

6,8,9,4,7,6,9,7,6,9,7,4,3,6,4,6,7,5,4,8,6,9,7,4,6,5,4.

Our null hypothesis is that the population median is 7, against the alternative that it is not 7. To perform the test in R we have the following code,

**CODE:**

```r
library(BSDA)
x <- c(6,8,9,4,7,6,9,7,6,9,7,4,3,6,4,6,7,5,4,8,6,9,7,4,6,5,4)
SIGN.test(x, md = 7, alternative = "two.sided", conf.level = 0.95)
```

**OUTPUT:**

```
	One-sample Sign-Test

data:  x
s = 6, p-value = 0.05248
alternative hypothesis: true median is not equal to 7
95 percent confidence interval:
 5 7
sample estimates:
median of x 
          6 

Achieved and Interpolated Confidence Intervals: 

                  Conf.Level L.E.pt U.E.pt
Lower Achieved CI     0.9478      5      7
Interpolated CI       0.9500      5      7
Upper Achieved CI     0.9808      5      7
```

Here the p-value is greater than 0.05, so we can't reject the null hypothesis at the 5% level of significance.
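Under the hood, the one-sample sign test is a binomial test: observations equal to the hypothesized median are dropped, the values above it are counted as successes, and that count is tested against a Binomial(n, 0.5) distribution. A minimal sketch in base R (no extra package needed) reproduces the s and p-value above:

```r
x <- c(6,8,9,4,7,6,9,7,6,9,7,4,3,6,4,6,7,5,4,8,6,9,7,4,6,5,4)

s <- sum(x > 7)   # values above the hypothesized median (6 of them)
n <- sum(x != 7)  # ties with the median are dropped, leaving 22 observations
binom.test(s, n, p = 0.5, alternative = "two.sided")  # p-value = 0.05248, as in SIGN.test()
```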

__Two-sample__:

Let us consider an example to understand the two-sample sign test. Suppose a juice shop has introduced two types of juice to the market, and the owner wants to know whether people like both juices equally. He took a random sample of 16 customers' reviews of which juice they prefer most, and noticed that 6 people out of 16 preferred the first juice. Here the null hypothesis is that both juices are equally preferred, in which case the proportion preferring either juice should be 0.5. So we are actually testing whether the success probability of a binomial distribution (with n = 16) is 0.5 or not.

Here we can perform the test in R as follows,

**CODE:**

```r
binom.test(6, 16, alternative = "two.sided", p = 0.5, conf.level = 0.95)
```

**OUTPUT:**

```
	Exact binomial test

data:  6 and 16
number of successes = 6, number of trials = 16, p-value = 0.4545
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.1519837 0.6456539
sample estimates:
probability of success 
                 0.375 
```

Here the p-value is 0.4545 > 0.05, so we can't reject the null hypothesis. The owner can say that, at the 5% level of significance, there is not enough evidence to conclude that the juices are not equally preferred. From the output we also get a 95% confidence interval. We can change the success probability under the null as needed by changing the value of "p" in the function binom.test().

If there is numerical data on the two varieties, we simply subtract one from the other, treat the positive signs as successes, count them, and run the binomial test as before.
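As a sketch with hypothetical paired data (the ratings below are made up for illustration), suppose the same eight customers rated both juices; the sign test on the differences then reduces to a binomial test:

```r
# Hypothetical ratings of the two juices from the same eight customers (illustrative data)
juice1 <- c(7, 5, 8, 6, 9, 4, 7, 6)
juice2 <- c(6, 6, 7, 6, 8, 5, 6, 5)

d <- juice1 - juice2        # paired differences
successes <- sum(d > 0)     # positive signs count as successes
trials <- sum(d != 0)       # zero differences are dropped
binom.test(successes, trials, p = 0.5, alternative = "two.sided")
```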

The main disadvantage of the sign test is that it does not take into account the magnitude of the differences. So we need another test that does.

For instance, consider data on the reviews of two schools from 15 individuals, where one has to decide whether both schools are equally good for one's child; the magnitudes of the differences in reviews clearly matter in such a decision.

**Wilcoxon Rank Sum Test and Mann-Whitney Test:**

Suppose we want to test whether one distribution lies to the right or to the left of another, or whether both occupy the same location. In other words, we want to compare the locations of two distributions.

Consider the class-test marks in mathematics of a class, where we want to know whether the boys got higher marks than the girls. We have no knowledge of the distributions of the marks of boys and girls, so we will conduct a non-parametric test. For this query we have the Wilcoxon rank sum test.

Suppose the marks of the 15 boys are,

66, 71, 85, 87, 86, 70, 88, 61, 68, 79, 73, 69, 67, 82, 60

and the marks of the 15 girls are,

60, 51, 67, 68, 52, 65, 63, 84, 50, 62, 69, 73, 57, 74, 85

Our null hypothesis is that the location shift is zero (i.e. both distributions have the same location) against the alternative that the location shift is greater than zero (i.e. the boys' marks tend to be higher than the girls' marks).

Here we can perform the Wilcoxon rank sum test using R as follows,

**CODE:**

```r
boy <- c(66, 71, 85, 87, 86, 70, 88, 61, 68, 79, 73, 69, 67, 82, 60)
girl <- c(60, 51, 67, 68, 52, 65, 63, 84, 50, 62, 69, 73, 57, 74, 85)
wilcox.test(boy, girl, alternative = "greater", mu = 0, paired = FALSE)
```

**OUTPUT:**

```
	Wilcoxon rank sum test with continuity correction

data:  boy and girl
W = 165, p-value = 0.01545
alternative hypothesis: true location shift is greater than 0

Warning message:
In wilcox.test.default(boy, girl, alternative = "greater", mu = 0,  :
  cannot compute exact p-value with ties
```

Here the p-value is less than 0.05 but greater than 0.01. So, at the 5% level of significance we can say that the boys' marks are higher than the girls' marks, but at the 1% level we do not have enough evidence to say so.
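The W statistic reported above can be computed by hand: rank the pooled sample, sum the ranks of the first group, and subtract the minimum possible rank sum n1(n1 + 1)/2. A sketch:

```r
boy <- c(66, 71, 85, 87, 86, 70, 88, 61, 68, 79, 73, 69, 67, 82, 60)
girl <- c(60, 51, 67, 68, 52, 65, 63, 84, 50, 62, 69, 73, 57, 74, 85)

r <- rank(c(boy, girl))               # ranks in the pooled sample (ties get average ranks)
n1 <- length(boy)
W <- sum(r[1:n1]) - n1 * (n1 + 1) / 2 # rank sum of the boys minus its minimum value
W                                     # 165, matching wilcox.test()
```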

We can visualize the two samples with the following R code,

```r
library(ggplot2)
data <- data.frame(boy, girl)
ggplot(data = data) +
  geom_histogram(aes(x = boy), fill = "green", alpha = 0.2, binwidth = 10) +
  geom_histogram(aes(x = girl), fill = "blue", alpha = 0.2, binwidth = 10) +
  xlab("boy (green) and girl (blue)")
```

**NOTE:** In the function wilcox.test() we must set "paired = FALSE" to get the Wilcoxon rank sum test. This test is equivalent to the Mann-Whitney test, so it is also known as the **Wilcoxon–Mann–Whitney test**.

**Wilcoxon Signed-Rank test:**

This is similar to the Wilcoxon rank sum test but is used for paired data. Suppose we have data on 10 children recording how many days a week they play football and video games, and we want to test whether they play video games more than football. As we have paired data on the same children, we apply the Wilcoxon signed-rank test.

We have the data,

| Sl No | Football | Video Games |
| --- | --- | --- |
| 1 | 5 | 7 |
| 2 | 2 | 4 |
| 3 | 6 | 2 |
| 4 | 0 | 6 |
| 5 | 6 | 7 |
| 6 | 1 | 2 |
| 7 | 6 | 0 |
| 8 | 0 | 4 |
| 9 | 5 | 0 |
| 10 | 4 | 6 |

Here our null hypothesis is that there is no difference between the median number of days a child plays football and video games, against the alternative that the difference is negative, i.e. the median for football is less than the median for video games.

So we can easily conduct the test using R,

**CODE:**

```r
football <- c(5,2,6,0,6,1,6,0,5,4)
vidgame <- c(7,4,2,6,7,2,0,4,0,6)
wilcox.test(football, vidgame, mu = 0, alternative = "less", paired = TRUE)
```

**OUTPUT:**

```
	Wilcoxon signed rank test with continuity correction

data:  football and vidgame
V = 24, p-value = 0.3794
alternative hypothesis: true location shift is less than 0

Warning message:
In wilcox.test.default(football, vidgame, mu = 0, alternative = "less",  :
  cannot compute exact p-value with ties
```

We get a p-value of 0.3794, which is greater than 0.05, so we can't reject the null hypothesis at the 5% level of significance. One can conclude that there is not enough evidence to say that children play video games more than football in a week.
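The V statistic in the output is the sum of the ranks of the absolute differences that carry a positive sign. A sketch of the computation:

```r
football <- c(5,2,6,0,6,1,6,0,5,4)
vidgame <- c(7,4,2,6,7,2,0,4,0,6)

d <- football - vidgame
d <- d[d != 0]                # zero differences are dropped (none here)
V <- sum(rank(abs(d))[d > 0]) # sum of ranks of the positive differences
V                             # 24, matching wilcox.test()
```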

We can compare the medians of the two samples with a box plot,

```r
data <- data.frame(game = rep(c("football", "video game"), each = 10),
                   days = c(football, vidgame))
boxplot(days ~ game, data = data, col = c(2, 5))
```

From the box plot we can see that the two medians are close to each other.

**Mood’s Median Test:**

This is a non-parametric test for checking the equality of the medians of two or more groups. Suppose a new company has started selling t-shirts. The owner checks the reviews of his product and wants to compare it with two established brands to see whether his product is as good as the others.

Suppose the data is,

| T-shirt Brand | Ratings |
| --- | --- |
| Nike | 5, 4.5, 3, 2.5, 4, 5, 4.5 |
| Adidas | 3, 3, 3, 3.5, 4, 4.5, 5, 4, 3.5, 4 |
| New | 4.5, 2.5, 4, 3.5, 4.5, 5, 5, 5, 4.5, 3 |

Now he wants to check whether the medians of the ratings of the three t-shirts are equal, so he can use Mood's median test. The null hypothesis is that all the medians are the same, against the alternative that at least one equality does not hold.

To perform Mood's median test we have several functions in R, such as mood.medtest() in the *RVAideMemoire* package, median_test() in the *coin* package, and Median.test() in the *agricolae* package.

**CODE:**

```r
library(agricolae)
rating <- c(5,4.5,3,2.5,4,5,4.5,3,3,3,3.5,4,4.5,5,4,3.5,4,4.5,2.5,4,3.5,4.5,5,5,5,4.5,3)
brand <- c(rep("Nike",7), rep("Adidas",10), rep("New",10))
data <- data.frame(brand, rating)
Median.test(data$rating, data$brand, alpha = 0.05)
```

**OUTPUT:**

```
The Median Test for data$rating ~ data$brand 

Chi Square = 3.857143   DF = 2   P.Value 0.1453557
Median = 4 

       Median  r Min Max   Q25   Q75
Adidas   3.75 10 3.0   5 3.125 4.000
New      4.50 10 2.5   5 3.625 4.875
Nike     4.50  7 2.5   5 3.500 4.750

Post Hoc Analysis

Groups according to probability of treatment differences and alpha level.

Treatments with the same letter are not significantly different.

       data$rating groups
New           4.50      a
Nike          4.50      a
Adidas        3.75      a
```

Here the p-value of the test is 0.1453557 > 0.05, so we can't reject the null hypothesis. We do not have enough evidence to say that his product is better or worse than the others.

We use Median.test() because it provides a post hoc analysis, from which we can easily check which medians differ if the null hypothesis gets rejected. In the last portion of the output we see the letter "a" in the groups column. Groups sharing the same letter have medians that are not significantly different; if another letter "b" appeared, groups carrying different letters would have significantly different medians. So this is how we conduct the median test with post hoc analysis in R.
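Conceptually, Mood's median test is a chi-square test on a contingency table that counts, for each group, how many observations fall above the grand median. The minimal sketch below (with ties at the median counted as "not above") reproduces the Chi Square = 3.857143 reported above:

```r
rating <- c(5,4.5,3,2.5,4,5,4.5,3,3,3,3.5,4,4.5,5,4,3.5,4,4.5,2.5,4,3.5,4.5,5,5,5,4.5,3)
brand <- c(rep("Nike",7), rep("Adidas",10), rep("New",10))

m <- median(rating)              # grand median = 4
tab <- table(brand, rating > m)  # per-brand counts at-or-below vs above the grand median
chisq.test(tab)                  # X-squared = 3.8571, df = 2, p-value = 0.1454
```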

**Runs Test:**

**Equality of Distributions:**

Suppose we have data on how many hours per day some boys and girls spend watching television, and our interest is to know whether boys and girls spend equal amounts of time watching it. So we want to test whether the distribution of the time a boy spends watching TV equals that of a girl. This is a test of the equality of two distributions.

Data: Boys (in hours): 3, 5, 3.5, 4.5, 6, 2, 6, 4, 9, 3, 4, 6

Girls (in hours): 5, 6, 5.5, 8, 6.5, 3, 8, 6.5, 7, 5, 3

So here our null hypothesis H₀ is F(x) = G(x), where F is the distribution function for the boys and G is the distribution function for the girls. We don't have any functional form for F or G, so we adopt a non-parametric test.

Here we first combine the two samples and sort them. After sorting, we replace each sorted value with a label identifying which group it came from.

So using R we can have the test as follows-

**CODE:**

```r
library(reshape)
library(tseries)
boy <- c(3, 5, 3.5, 4.5, 6, 2, 6, 4, 9, 3, 4, 6)
girl <- c(5, 6, 5.5, 8, 6.5, 3, 8, 6.5, 7, 5, 3)
data <- data.frame(Gender = as.factor(c(rep("boy", length(boy)), rep("girl", length(girl)))),
                   Time = c(boy, girl))
data <- sort_df(data, vars = "Time")
runs.test(data$Gender)
```

**OUTPUT:**

```
	Runs Test

data:  data$Gender
Standard Normal = -2.343, p-value = 0.01913
alternative hypothesis: two.sided
```

From the output, the p-value is 0.01913, which is less than 0.05 but greater than 0.01. So at the 5% level of significance we can say that boys and girls do not spend equal time watching television, but at the 1% level we do not have enough evidence to conclude that.

**NOTE:** If the two distributions are equal, the group labels are well mixed after sorting and the number of runs is large; if the distributions differ, one group's values cluster together and we get fewer runs. This is the idea behind using the runs test to check the equality of two distributions.
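A run is a maximal block of identical labels in the sorted sequence, and we can count the runs directly with rle(). A sketch (note that order() keeps ties in their original order, so at tied times the boys sort before the girls; a different tie-breaking could change the count slightly):

```r
boy <- c(3, 5, 3.5, 4.5, 6, 2, 6, 4, 9, 3, 4, 6)
girl <- c(5, 6, 5.5, 8, 6.5, 3, 8, 6.5, 7, 5, 3)

gender <- c(rep("boy", length(boy)), rep("girl", length(girl)))
sorted_gender <- gender[order(c(boy, girl))]  # labels after sorting the pooled sample

n_runs <- length(rle(sorted_gender)$lengths)  # number of maximal blocks of equal labels
n_runs                                        # 7 runs, far fewer than the ~12.5 expected under H0
```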

**Checking Randomness:**

Now suppose some chocolates were given to a class monitor (Rohit) to distribute to all the students in the class. Mithun (one of his rivals) suspected that Rohit gave his close friends more chocolates than the other students. How can Mithun check this?

Here he can use a non-parametric runs test to verify whether the class monitor distributed the chocolates randomly (no bias) or not.

Here the null hypothesis H₀ is that the chocolates are randomly distributed, against the alternative that they are not.

He asked everyone in the class, from the first bench to the last, how many chocolates they had received.

So, he got the data,

3, 5, 5, 3, 3, 3, 3, 4, 4, 5, 4, 6, 4, 5, 7, 6, 7, 3, 4, 8, 9, 3, 6, 6, 5, 3, 8, 8, 4, 8, 6, 5.

First, he found the median to be 5. He then replaced each original value with "U" or "D" according to whether it is greater than 5 or not ("U" if > 5, "D" if ≤ 5). After that, he performs the runs test to check whether the chocolates were distributed randomly.

We can have this in R as follows,

**CODE:**

```r
library(tseries)
x <- c(3, 5, 5, 3, 3, 3, 3, 4, 4, 5, 4, 6, 4, 5, 7, 6, 7, 3, 4, 8, 9, 3, 6, 6, 5, 3, 8, 8, 4, 8, 6, 5)
y <- ifelse(x > median(x), "U", "D")
runs.test(as.factor(y), alternative = "two.sided")
```

**OUTPUT:**

```
	Runs Test

data:  as.factor(y)
Standard Normal = -1.1526, p-value = 0.2491
alternative hypothesis: two.sided
```

Here the **p-value** of the runs test is 0.2491 > 0.05, so Mithun does not have enough evidence to say that Rohit favoured his close friends when distributing the chocolates.
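The Standard Normal statistic reported by runs.test() comes from the normal approximation to the number of runs R: under randomness, E(R) = 2·n1·n2/n + 1 and Var(R) = 2·n1·n2·(2·n1·n2 − n) / (n²·(n − 1)), where n1 and n2 are the counts of the two labels. A sketch reproducing the −1.1526 above:

```r
x <- c(3, 5, 5, 3, 3, 3, 3, 4, 4, 5, 4, 6, 4, 5, 7, 6, 7, 3, 4, 8, 9, 3, 6, 6, 5, 3, 8, 8, 4, 8, 6, 5)
y <- ifelse(x > median(x), "U", "D")

R <- length(rle(y)$lengths)                               # observed number of runs (13)
n1 <- sum(y == "U"); n2 <- sum(y == "D"); n <- n1 + n2

ER <- 2 * n1 * n2 / n + 1                                 # expected number of runs
VarR <- 2 * n1 * n2 * (2 * n1 * n2 - n) / (n^2 * (n - 1)) # variance of the run count
z <- (R - ER) / sqrt(VarR)
round(z, 4)                                               # -1.1526, matching runs.test()
```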