Which Non Parametric Tests to Apply When

In hypothesis testing, we often come across situations where nothing can be assumed about the population distribution, or where the data are not in a usable numerical form (ordinal or nominal data). In such situations, the basic assumptions of parametric tests do not hold and nonparametric tests are used instead.

Nonparametric tests make fewer assumptions than parametric tests. They assume nothing beforehand about the probability distribution of the population and are therefore referred to as distribution-free tests. They are also easy to understand and easy to use.

The hypotheses which can be tested using nonparametric tests are:

  • Testing whether two independent samples come from identical populations
  • Testing whether the samples are drawn from populations having an identical median
  • Testing the randomness of one sample or two samples
  • Testing whether a sample comes from a specified theoretical distribution

Many tests are available for these hypotheses, and the main question is which test is appropriate in a given situation. Given below is a list of some nonparametric tests and their application in hypothesis testing.

The Sign Test:

The sign test is the simplest of all. As the name suggests, it is based on the signs (pluses or minuses) of the observations and not on their magnitudes (the data here are nominal). The sign test can be of two types, namely:

  • The one sample sign test
  • The paired sample sign test (two sample sign test)

In the one-sample sign test, we test whether the sample is drawn from a population with a specified median. The only assumption behind the one-sample sign test is that the observations are drawn independently from a continuous distribution. The assumption of continuity matters because it implies that no ties should occur (no observation should equal the hypothesised median), but in practical situations ties may occur.

In such a situation we ignore the tied observations and the rest of the procedure remains the same. Let us consider an example of the scores of students of a particular school which claims that their median score is greater than 60. The scores are as follows:

81, 76, 53, 71, 66, 59, 88, 73, 80, 66, 58, 70, 61, 56, 55

Here we have to test the school’s claim. The null hypothesis is H0: the median score is 60 (median = 60), against the alternative hypothesis H1: the median score is greater than 60 (median > 60). We assign a plus sign to each sample value greater than the hypothesised median (here 60) and a minus sign to each sample value less than it. This gives 10 plus signs and 5 minus signs. The critical value K is calculated using the expression

$$K = \frac{n-1}{2} - 0.98\sqrt{n}$$

where n is the number of sample observations. Substituting n = 15 in the above expression, we get K = 3.20. We compare this value with S, the number of times the less frequent sign occurs. In this example the minus sign is the less frequent one, so S = 5.

H0 is rejected if S ≤ K. Here S > K (5 > 3.20), so we fail to reject the null hypothesis: the data do not give sufficient evidence that the population median exceeds 60. For large samples (n > 20), the normal approximation to the binomial distribution can be used. The value of z is given as

$$z = \frac{X - np}{\sqrt{np(1-p)}}$$

where X is the number of plus signs, n is the number of observations and p is the probability of a plus sign. The value of p is taken as 0.5 because, under the null hypothesis, a plus and a minus sign are equally likely. The command SIGN.test() (from the BSDA package) can be used to conduct the sign test in R. For the above example the R code is given as:

Notice the p-value (=0.15091) in the output. It is greater than the significance level of 0.05. Hence, we fail to reject the null hypothesis.
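The one-sample calculation above can be checked with a short sketch. It is written in Python using only the standard library and is not the R listing the article refers to. The function name `sign_test` is hypothetical, and the critical-value formula K = (n − 1)/2 − 0.98√n is an assumption, chosen because it reproduces K = 3.20 for n = 15.

```python
from math import comb, sqrt

scores = [81, 76, 53, 71, 66, 59, 88, 73, 80, 66, 58, 70, 61, 56, 55]
hypothesised_median = 60


def sign_test(values, median):
    """One-sample sign test for H1: population median > `median`.

    Returns (plus, minus, k_crit, s, p_value). Observations equal to the
    hypothesised median (ties) are dropped, as described in the text.
    """
    plus = sum(v > median for v in values)
    minus = sum(v < median for v in values)
    n = plus + minus                        # sample size after dropping ties
    k_crit = (n - 1) / 2 - 0.98 * sqrt(n)   # assumed critical-value formula
    s = min(plus, minus)                    # count of the less frequent sign
    # Exact one-sided p-value: P(X >= plus) with X ~ Binomial(n, 0.5)
    p_value = sum(comb(n, i) for i in range(plus, n + 1)) / 2 ** n
    return plus, minus, k_crit, s, p_value


plus, minus, k_crit, s, p_value = sign_test(scores, hypothesised_median)
print(plus, minus, round(k_crit, 2), s, round(p_value, 4))
print("reject H0" if s <= k_crit else "fail to reject H0")
```

For the scores above this gives 10 plus signs, 5 minus signs, K of about 3.20 and an exact one-sided p-value of about 0.151, so both decision rules lead to the same conclusion.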

The sign test also has an important application when two paired (dependent) samples, such as before and after measurements, are to be tested for a significant difference. The test assumes that the paired samples are drawn from continuous populations that may be different. The data need to be at least ordinal so that the observations in each pair can be compared.

Consider the results of a clinical experiment where a new drug is tested on a group of patients suffering from hypertension. The patients are asked to score their satisfaction level out of 100, with a score of zero denoting strong dissatisfaction and a score of 100 denoting strong satisfaction. We proceed in exactly the same way as in the one-sample case, working with the sign of each after-minus-before difference. The scores before and after treatment are as given:

The null hypothesis is that there is no significant difference between the scores before and after treatment. The alternative hypothesis is that the drug has a positive effect, i.e., the median score after treatment is greater than the median score before treatment.

Calculating K as in the one-sample case, we get K = 0.73 for n = 8 paired observations. Comparing this with S = 2 (the less frequent sign appears 2 times), we see that S > K. Therefore we fail to reject the null hypothesis that there is no significant difference between the before and after samples. The R code for the two-sample test can be run simply by replacing y=NULL with a y vector holding the second sample in the SIGN.test() command. A snippet for the same is as follows:

Mann-Whitney U Test (Wilcoxon Rank Sum Test):

The Mann-Whitney U test uses the ranks assigned to the sample observations to determine whether two samples come from the same population. In this test, we assume that the samples are independent and that the observations are at least ordinal, so that they can be ranked. If the samples are drawn from identical populations, the mean ranks of the two samples should be more or less the same.

Let us take an example where two samples A and B are given and we have to test whether they are drawn from the same population. The samples are as follows:

The null hypothesis to test is that the two samples are from the same population against the alternative hypothesis that the two samples are from different populations. We start by assigning ranks to the samples. The ranks are given as follows:

In case of ties, each tied observation receives the average of the ranks the observations would have received had there been no ties. The statistics U1 and U2 for the U test are given as:

$$U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1, \qquad U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2$$

where n1 (=12) and n2 (=12) are the sizes of samples A and B respectively, and R1 (=123.5) and R2 (=176.5) are the sums of the ranks of samples A and B respectively. This gives U1 = 98.5 and U2 = 45.5. For comparison, we take U = min {U1, U2}. Here U = 45.5, and this value is compared with the tabulated value for n1 and n2.

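For n1, n2 > 10, the normal approximation can be used, with mean and variance

$$\mu_U = \frac{n_1 n_2}{2}, \qquad \sigma_U^2 = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$$

Here the mean is 72 and the variance is 300, so z = (U − 72)/√300 = (45.5 − 72)/17.32 ≈ −1.53. Since |z| = 1.53 < 1.96 (the critical value at the 5% level of significance), we fail to reject the null hypothesis; the data are consistent with the two samples coming from the same population.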

In R programming, the command wilcox.test() is used to conduct Mann-Whitney U test. The R code for the above example is given below.

We get a p-value of 0.1299, which is greater than the significance level of 0.05. Thus, we fail to reject the null hypothesis that the two samples are from the same population.
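The hand calculation for the U test can be sketched from the quantities stated above (the sample sizes and rank sums), since the raw samples are not reproduced here. This is a Python sketch using only the standard library, not the R listing. The function name `mann_whitney_from_rank_sums` is hypothetical, and the normal approximation below uses the textbook variance without a tie or continuity correction, so its z need not match a figure computed with those corrections.

```python
from math import sqrt


def mann_whitney_from_rank_sums(n1, n2, r1, r2):
    """Return (u1, u2, u, z) from sample sizes and rank sums.

    z uses the large-sample normal approximation with no tie correction
    and no continuity correction (an assumption of this sketch).
    """
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return u1, u2, u, z


# Values quoted in the text: n1 = n2 = 12, R1 = 123.5, R2 = 176.5
u1, u2, u, z = mann_whitney_from_rank_sums(12, 12, 123.5, 176.5)
print(u1, u2, u, round(z, 2))
```

A quick sanity check on any such calculation is that U1 + U2 always equals n1 × n2, which is 144 here.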

Wald-Wolfowitz Run Test:

The Wald-Wolfowitz run test, or simply the run test, checks whether a sequence of elements is random. The test is based on the theory of runs. A run is a sequence of identical letters that is preceded and followed by a different letter or by no letter at all.

Let us consider a sequence of manufactured items from a production house, with good (G) and defective (D) items as follows: GDDGGDGDDDGGDGDDGGDDGDGG. We first define the null hypothesis as H0: the given sequence is random. Here we have r = 15 runs, n1 = 12 good items and n2 = 12 defective items. The number of runs, r, has its own sampling distribution with mean and variance given as:

$$\mu_r = \frac{2 n_1 n_2}{n_1 + n_2} + 1, \qquad \sigma_r^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}$$

where n1 and n2 are the numbers of items of the first and second kind respectively. The test statistic therefore becomes

$$z = \frac{r - \mu_r}{\sigma_r}$$

Substituting the values of r, n1 and n2 in the above expressions, we obtain z = 0.834. This calculated value of z is then compared with the tabulated value at the α% level of significance. Taking the level of significance as 5%, we observe that the calculated value (0.834) is less than the tabulated value (=1.64), so we fail to reject the null hypothesis: there is no evidence that the sequence is non-random. To run this test in R, let us denote the good items as 1 and the defective items as 0. The command runs.test() is used to conduct a run test in R.

We thus fail to reject the null hypothesis, as the p-value (0.4038) is greater than the significance level of 0.05.
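The run count and the normal approximation for the sequence above can be reproduced with a small sketch. It is written in Python using only the standard library and is not the R listing. The function name `runs_test` is hypothetical, and the p-value is two-sided, obtained from the normal approximation via `math.erfc`.

```python
from itertools import groupby
from math import erfc, sqrt


def runs_test(sequence):
    """Wald-Wolfowitz runs test for a sequence made of exactly two symbols.

    Returns (runs, n1, n2, z, p_value) using the normal approximation.
    """
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("sequence must contain exactly two distinct symbols")
    n1 = sequence.count(symbols[0])
    n2 = sequence.count(symbols[1])
    runs = sum(1 for _ in groupby(sequence))   # each group of equal symbols is one run
    mean = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    z = (runs - mean) / sqrt(variance)
    p_value = erfc(abs(z) / sqrt(2))           # two-sided normal tail probability
    return runs, n1, n2, z, p_value


runs, n1, n2, z, p_value = runs_test("GDDGGDGDDDGGDGDDGGDDGDGG")
print(runs, n1, n2, round(z, 3), round(p_value, 4))
```

For this sequence the sketch finds 15 runs with 12 items of each kind, a z of roughly 0.83 and a two-sided p-value of roughly 0.40, in line with the figures quoted above.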

Kruskal-Wallis One-Way Analysis of Variance by Ranks:

The Kruskal-Wallis one-way analysis of variance is a useful test when several independent samples are involved. It helps in deciding whether k (>2) independent samples come from the same population, or from identical populations with the same median.

It is assumed that the observations are independent and at least ordinal. Like the Mann-Whitney U test, this test begins by ranking the observations of the combined sample in ascending order. The average ranks of the samples should be about the same if they are from the same population. To test whether they are, the Kruskal-Wallis test statistic is given as:

$$KW = \frac{12}{N(N+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(N+1)$$

where

  • k is the number of samples
  • nj is the number of observations in the jth sample
  • N is the number of observations in the combined sample
  • Rj is the sum of the ranks in the jth sample

The sampling distribution of the KW statistic can be well approximated by the χ² distribution with (k-1) degrees of freedom when the number of samples (k) is more than 3 and the number of observations in each sample exceeds 5.

Let us suppose a factory installs three machines and wants to determine whether their outputs differ significantly. The output of the different machines is given as:

Machine A : 80, 83, 79, 85, 90, 68
Machine B : 82, 84, 60, 72, 86, 67, 91
Machine C : 93, 65, 77, 78, 88

The ranks allotted for the above data are given as:
Ranks of machine A: 9, 11, 8, 13, 16, 4
Ranks of machine B: 10, 12, 1, 5, 14, 3, 17
Ranks of machine C: 18, 2, 6, 7, 15

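The average ranks of machines A, B and C are 10.17, 8.86 and 9.60 (rank sums 61, 62 and 48). Putting these values in the expression for the test statistic, we get KW = 0.197, which is less than the tabulated value of χ² with 2 degrees of freedom at the 0.05 level (=5.991). Hence, we fail to reject H0 at the 5% level of significance: the outputs of the three machines do not differ significantly.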
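The ranking and the KW statistic for the machine data can be reproduced with the following sketch. It is written in Python using only the standard library and is not the R listing. The helper names `midranks` and `kruskal_wallis` are hypothetical, and the statistic is computed without a tie correction, which is harmless here because the data contain no ties.

```python
def midranks(values):
    """Rank values in ascending order; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (rank_sums, kw) for a list of samples (no tie correction)."""
    pooled = [v for group in groups for v in group]
    ranks = midranks(pooled)
    n_total = len(pooled)
    rank_sums, start = [], 0
    for group in groups:
        rank_sums.append(sum(ranks[start:start + len(group)]))
        start += len(group)
    weighted = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    kw = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return rank_sums, kw


machine_a = [80, 83, 79, 85, 90, 68]
machine_b = [82, 84, 60, 72, 86, 67, 91]
machine_c = [93, 65, 77, 78, 88]

rank_sums, kw = kruskal_wallis([machine_a, machine_b, machine_c])
print(rank_sums, round(kw, 3))
```

The rank sums come out as 61, 62 and 48, and the statistic as roughly 0.197, well below the tabulated value of 5.991.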

Following is the R code for running a Kruskal-Wallis test.

Kolmogorov Smirnov Test:

The Kolmogorov-Smirnov test is a test of goodness of fit. It tests whether a sample comes from a specified theoretical distribution, and is concerned with the degree of agreement between the distribution of a set of sample values and that theoretical distribution. The theoretical distribution is assumed to be continuous. Let F0(x) be the specified cumulative relative frequency distribution, i.e. for any value x, F0(x) is the proportion of cases expected to have values equal to or less than x (X≤x).

Also, let S0(x) be the observed cumulative distribution function. The null hypothesis states that the sample has been drawn from the specified theoretical distribution. If H0 is true, we would expect the differences between F0(x) and S0(x) to be small. The Kolmogorov-Smirnov test focuses on the largest of these deviations. Thus, the test statistic is given as:

$$D = \max_x \left| F_0(x) - S_0(x) \right|$$

The value of this statistic is then compared with the tabulated value of D at the α% level. H0 is rejected if the calculated value is greater than the tabulated value; otherwise we fail to reject it.
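The statistic D can be illustrated with a small sketch that works on grouped frequencies, taking the largest gap between two cumulative relative frequency curves. It is written in Python using only the standard library. The function name `ks_statistic` is hypothetical, and the frequencies below are made-up illustration values, not the data of the example in this article.

```python
from itertools import accumulate


def ks_statistic(observed, expected):
    """Largest absolute gap between two cumulative relative frequency curves."""
    if len(observed) != len(expected):
        raise ValueError("both frequency lists must have the same length")
    total_observed, total_expected = sum(observed), sum(expected)
    cumulative_observed = [c / total_observed for c in accumulate(observed)]
    cumulative_expected = [c / total_expected for c in accumulate(expected)]
    return max(abs(o - e)
               for o, e in zip(cumulative_observed, cumulative_expected))


# Hypothetical class frequencies, for illustration only
observed = [8, 12, 20, 10]
expected = [10, 10, 20, 10]

d = ks_statistic(observed, expected)
print(round(d, 4))
```

With these illustrative frequencies the cumulative proportions differ only in the first class (0.16 against 0.20), so D is 0.04. The decision then follows by comparing D with the tabulated value for the sample size and significance level in use.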

Let us look at an example where observed and predicted frequencies are given and we have to test whether the observed sample can be thought to have come from the theoretical (predicted) distribution. We first find the cumulative relative frequencies by dividing the cumulative observed and predicted frequencies by their totals, 683 and 683.2 respectively. We then find the differences between the cumulative relative frequencies of the observed and predicted values.

The maximum of the absolute differences (=0.015) is the value of the test statistic D. The tabulated value of D for n=10 at the 5% level of significance is 0.409. Since the calculated value is less than the tabulated value, we fail to reject the null hypothesis. The R code for the Kolmogorov-Smirnov test is as follows:

As seen above, non-parametric tests are practical and straightforward. They make few assumptions and can be applied to both small and large samples. Their main disadvantage is that they have less statistical power than parametric tests, i.e., they are less likely to reject the null hypothesis when the alternative hypothesis is true.

It is also disadvantageous to use non-parametric methods when the assumptions of parametric methods are met and the data are measured on an interval or ratio scale. The recommendation is therefore to use parametric tests whenever their assumptions hold, and to fall back on non-parametric tests when they do not.

