# Design Of Experiment: Completely Randomized Design


**Introduction:**

Experimentation and inference are twin essential features of general scientific methodology. After setting up a statistical problem, we perform experiments to collect information, on the basis of which we draw inferences about that problem.

Although sample surveys and design of experiments are both concerned with data collection, we use them for different purposes. In a sample survey, we derive methods for collecting representative samples from a population so that we can infer the characteristics of that population. In design of experiments, no such population exists; we have to define the experimental units on which the experiment is performed. So in design of experiments the experimenter has control over the experiment, whereas in a sample survey there is no such control: the sample observations occur in nature and cannot be subjected to any experimental control.

Suppose we want an estimate of the average adult height in a city. This is a sample survey problem: we decide on the sampling technique, collect the data and infer the height of that population. Now suppose we want to know which of five given varieties of rice is expected to give the maximum yield in the long run. To answer this, we have to conduct an experiment.

Before discussing the principles of designs, it is proper to explain the terminology used in this context. The commonly used terms are experiment, treatment, experimental unit, experimental error and precision.

**Experiment** – An experiment is a planned procedure for obtaining an answer to the question that the experimenter has in mind. In planning an experiment, we clearly state our objectives and formulate the hypotheses we want to test.

**Experimental unit** – An experimental unit is the material to which a treatment is applied and on which the variable under study is measured.

**Treatment** – The procedures/objects under comparison in an experiment are the treatments.

**Experimental error** – There is always some variation in the results of an experiment. Part of this variation can be controlled; the remaining part, which is random, is called experimental error.

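**Precision** – It is measured by the reciprocal of the variance of a mean, i.e.

$$\text{Precision} = \frac{1}{\operatorname{Var}(\bar{x})} = \frac{n}{\sigma^2},$$

where $\sigma^2$ is the variance of a single observation and the observations are assumed independent. As n (the number of observations) increases, precision increases.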
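The link between the number of observations and precision can be checked numerically. The following is a minimal sketch, assuming independent observations with a common variance; the value of `sigma2` and the sample sizes in `n` are illustrative choices, not data from any experiment.

```r
# Precision of a sample mean = 1 / Var(mean) = n / sigma^2,
# assuming independent observations with common variance sigma^2.
sigma2 <- 4                  # assumed error variance (illustrative)
n <- c(2, 4, 8, 16)          # assumed numbers of observations (illustrative)

var_mean  <- sigma2 / n      # variance of the mean for each n
precision <- 1 / var_mean    # reciprocal of that variance

data.frame(n, var_mean, precision)
```

Each doubling of `n` halves the variance of the mean and therefore doubles the precision.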

# Basic principles of design:

There are three basic principles of design:

(i) Randomization (ii) Replication (iii) Local control.

**Randomization:** Treatments are allocated to the experimental units at random. This is essential for getting a valid estimate of the random experimental error, and it minimizes bias in the experiment.

**Replication:** As we increase the number of replications (observations), the error variance of a treatment mean decreases and, as a result, precision increases.

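**Local control:** The third principle is local control, or error control. Randomization and replication alone do not remove all extraneous variation; local control reduces the experimental error further by grouping similar experimental units into homogeneous blocks.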

**Completely Randomized Design**

CRD is the simplest design in which replication and randomization are used. Suppose we have $t$ levels of a factor, with $r_i$ replications of level $i$, $i = 1, 2, \dots, t$. The total number of experimental units is $n = \sum_{i=1}^{t} r_i$ (for simplicity we take all $r_i = r$, so that $n = tr$). We allocate the treatments to the $n$ experimental units completely at random. This design can be viewed as a one-way ANOVA model.
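The random allocation step can be sketched in R as follows. This is only an illustration: the treatment labels, the number of replications and the seed are assumptions chosen to mirror a small balanced layout, and `layout` is a hypothetical name.

```r
# Completely randomized allocation of t treatments to n = t * r units.
set.seed(1)                        # assumed seed, used only for reproducibility
treatments <- c("A", "B", "C")     # t = 3 treatment levels (illustrative)
r <- 6                             # replications per treatment (illustrative)
n <- length(treatments) * r        # total number of experimental units

# Shuffle the r copies of each treatment label over the n units.
layout <- data.frame(
  unit      = seq_len(n),
  treatment = sample(rep(treatments, each = r))
)

table(layout$treatment)            # each treatment still appears r times
```

Because `sample()` only permutes the labels, every treatment keeps exactly `r` replications while the assignment of treatments to units is left to chance.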

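The model we consider here is the usual one-way ANOVA model,

$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \qquad i = 1, 2, \dots, t; \quad j = 1, 2, \dots, r,$$

where $y_{ij}$ is the $j$-th observation on the $i$-th treatment, $\mu$ is the general mean, $\tau_i$ is the effect of the $i$-th treatment and the $\varepsilon_{ij}$ are random errors, assumed independent and normally distributed with mean 0 and variance $\sigma^2$.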
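For reference, the test attached to a one-way ANOVA model can be written out as below. This is the standard textbook formulation for equal replication $r$ (so $n = tr$), not a derivation given by the author. Here $y_{ij}$ is the $j$-th observation on treatment $i$, $\tau_i$ is the effect of treatment $i$, $\bar{y}_{i.}$ is the mean of treatment $i$ and $\bar{y}_{..}$ is the grand mean.

$$H_0: \tau_1 = \tau_2 = \dots = \tau_t \quad \text{against} \quad H_1: \text{at least two } \tau_i \text{ differ}$$

$$\underbrace{\sum_{i=1}^{t}\sum_{j=1}^{r} (y_{ij} - \bar{y}_{..})^2}_{SS_{\text{Total}}} = \underbrace{r \sum_{i=1}^{t} (\bar{y}_{i.} - \bar{y}_{..})^2}_{SS_{\text{Treatment}}} + \underbrace{\sum_{i=1}^{t}\sum_{j=1}^{r} (y_{ij} - \bar{y}_{i.})^2}_{SS_{\text{Error}}}$$

$$F = \frac{SS_{\text{Treatment}} / (t - 1)}{SS_{\text{Error}} / (n - t)} \sim F_{t-1,\, n-t} \quad \text{under } H_0$$

A large value of $F$, or equivalently a small p-value, is evidence against $H_0$.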

**Let us consider an example:**

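There are 3 levels of a factor: A, B and C. We want to test whether they differ significantly. Let us take a sample of six observations for each level, as shown below.

| Observation | A | B | C |
|---|---|---|---|
| 1 | 22 | 52 | 16 |
| 2 | 42 | 33 | 24 |
| 3 | 44 | 8 | 19 |
| 4 | 52 | 47 | 18 |
| 5 | 45 | 43 | 34 |
| 6 | 37 | 32 | 39 |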

Using R, we can easily test the significance of the levels:

```r
A <- c(22, 42, 44, 52, 45, 37)  # Observations of level A
B <- c(52, 33, 8, 47, 43, 32)   # Observations of level B
C <- c(16, 24, 19, 18, 34, 39)  # Observations of level C
x <- data.frame(A, B, C)
r <- c(t(as.matrix(x)))         # Responses read row by row: A1, B1, C1, A2, ...
f <- c("Item1", "Item2", "Item3")  # Labels for levels A, B, C
k <- 3                          # Number of levels
n <- 6                          # Number of observations in each level
levels <- gl(k, 1, n * k, factor(f))  # Treatment labels matching the order of r
a <- aov(r ~ levels)
summary(a)
```

**OUTPUT:**

```
A <- c(22,42,44,52,45,37) #Observations of level A
B <- c(52,33,8,47,43,32) #Observations of level B
C <- c(16,24,19,18,34,39) #Observations of level C
x <- data.frame(A,B,C);
   A  B  C
1 22 52 16
2 42 33 24
3 44  8 19
4 52 47 18
5 45 43 34
6 37 32 39
r <- c(t(as.matrix(x)))
f <- c("Item1", "Item2", "Item3")
k <- 3 #number of levels
n <- 6 #number of observations in each level
levels <- gl(k, 1, n*k, factor(f)) #Matching treatments
a <- aov(r~levels)
Call:
   aov(formula = r ~ levels)

Terms:
                  levels Residuals
Sum of Squares  745.4444 2200.1667
Deg. of Freedom        2        15

Residual standard error: 12.11106
Estimated effects may be unbalanced
summary(a)
            Df Sum Sq Mean Sq F value Pr(>F)
levels       2  745.4   372.7   2.541  0.112
Residuals   15 2200.2   146.7
```

The **p-value is 0.112**, which is greater than 0.05, the chosen level of significance.

So we cannot reject the null hypothesis. That means there is no significant difference between the three levels A, B and C at the 5% level.
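To see where the numbers in the ANOVA table come from, the sums of squares and the F statistic can be recomputed by hand. The sketch below reuses the observations from the example; the long-format data frame `d` and the other variable names are illustrative choices, not part of the original code.

```r
# Observations from the example above
A <- c(22, 42, 44, 52, 45, 37)
B <- c(52, 33, 8, 47, 43, 32)
C <- c(16, 24, 19, 18, 34, 39)

# Long format: one row per observation, with its treatment label
d <- stack(data.frame(A, B, C))
names(d) <- c("y", "treatment")

t_levels <- nlevels(d$treatment)   # number of treatments, t
n        <- nrow(d)                # total number of observations

grand_mean  <- mean(d$y)
group_means <- tapply(d$y, d$treatment, mean)
group_sizes <- tapply(d$y, d$treatment, length)

# Between-treatment and within-treatment sums of squares
ss_treat <- sum(group_sizes * (group_means - grand_mean)^2)
ss_error <- sum((d$y - ave(d$y, d$treatment))^2)

# F statistic and its upper-tail probability under the null hypothesis
f_value <- (ss_treat / (t_levels - 1)) / (ss_error / (n - t_levels))
p_value <- pf(f_value, t_levels - 1, n - t_levels, lower.tail = FALSE)

round(c(SS_treat = ss_treat, SS_error = ss_error, F = f_value, p = p_value), 3)
```

The results should agree with the `summary(a)` table above: sums of squares of about 745.4 and 2200.2, an F value of 2.541 and a p-value of 0.112.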

**Author: Rahul Roy**