AB Testing With R: An Example Of Marketing Campaign
In this article, we will learn the concepts and implementation of A/B testing using R. Marketing campaigns are meant to influence a targeted audience and encourage them to purchase a product. In the process, many questions arise in a retailer's mind. For example: which advertisement leads to more sales? Does a good discount percentage really attract more customers? Which slogans would work better?
Even ecommerce companies in India like Amazon and Flipkart have a lot of questions about their websites, application designs, and marketing strategies. These questions can be answered by conducting an A/B test.
Working Of The A/B Test
We use A/B testing when we want to compare two versions of a product (A and B) shown to similar customers, to see which version sells better, or when we want to compare two groups of customers (A and B) offered similar products, to see which group we should target.
For example for a website:
Null Hypothesis: Assumption that there is no difference between the conversion rates for products A and B
Alternative Hypothesis: There is a difference between the conversion rates for products A and B
To reject the null hypothesis we need a p-value that is lower than the significance level, i.e. p < 0.05.
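install.packages("pwr")
library(pwr)
######## 2-sample test for equality of proportions ############
# First vector: conversions for A and B; second vector: visitors for A and B.
# prop.test() itself comes from the stats package that ships with R.
prop.test(c(225, 250), c(3450, 3000))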
The p-value is less than 0.05, so we can reject the hypothesis that the conversion rates are equal.
But one cannot conclude with certainty that A and B have different conversion rates. The true underlying behavior is not known, because we are testing the hypothesis through an experiment on a sample.
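To see what the test is doing, the calculation can be sketched by hand in base R. This is an illustration rather than part of the original analysis: it computes the pooled two-proportion z statistic from the same counts and compares it with prop.test() when the continuity correction is switched off (prop.test() applies that correction by default, so its default p-value differs slightly). The variable names are chosen here for illustration.

```r
# Counts from the example above: conversions and visitors for A and B
conversions <- c(A = 225, B = 250)
visitors    <- c(A = 3450, B = 3000)

p_hat  <- conversions / visitors             # observed conversion rates
p_pool <- sum(conversions) / sum(visitors)   # pooled rate under the null

# Standard error of the difference in rates under the null hypothesis
se <- sqrt(p_pool * (1 - p_pool) * sum(1 / visitors))

z       <- unname((p_hat["B"] - p_hat["A"]) / se)
p_value <- 2 * pnorm(-abs(z))                # two-sided p-value

# Reference value from prop.test() without the continuity correction
reference <- prop.test(conversions, visitors, correct = FALSE)
print(c(z = z, p_manual = p_value, p_prop_test = unname(reference$p.value)))
```

The manual p-value and the uncorrected prop.test() p-value should agree up to numerical precision.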
The Disadvantages of Using A/B Test:

 The A/B test considers sample data from the whole population at a certain point in time, so the test is limited to that point in time. Secondly, the sample data may not tell us the true conversion rate of the original population.
 Moreover, A/B testing works only by calculating a p-value for the hypothesis. This means that it fails to answer questions such as “how likely is it that B is similar to or better than A, and by how much?”
Bayesian A/B Test
Bayesian statistics in A/B testing is mainly based on past or prior knowledge of similar experiments and the present data. The past knowledge is known as the prior, or prior probability distribution (Wiki), and it is combined with the current experiment data to draw a conclusion about the test at hand.
In this method, we model the metric for each variant. We have prior knowledge about the conversion rate for A which has a certain range of values based on the historical data. After observing data from both variants, we estimate the most likely values or the new evidence for each variant.
Now we need to know:
What is Posterior Probability Distribution?
Posterior probability is the probability of an event happening after all the background information about the event has been taken into account. It can be seen as an adjustment of the prior probability:
Posterior probability = prior probability combined with new evidence (called the likelihood). More precisely, by Bayes' theorem the posterior distribution is proportional to the prior distribution multiplied by the likelihood function (the “new evidence”): P(θ | data) ∝ P(data | θ) × P(θ).
Open the link for further information: Wiki
By calculating this posterior distribution for each variant, we can express the uncertainty about our beliefs through probability statements.
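For conversion data with a Beta prior, this update has a simple closed form: a Beta(alpha, beta) prior combined with s conversions out of n visitors gives a Beta(alpha + s, beta + n - s) posterior. The sketch below applies that rule to the counts from the earlier example in base R. The uniform Beta(1, 1) prior and the variable names are assumptions made for illustration.

```r
# Uniform Beta(1, 1) prior: an illustrative assumption, not a recommendation
prior_alpha <- 1
prior_beta  <- 1

# Observed data from the earlier example
conversions <- c(A = 225, B = 250)
visitors    <- c(A = 3450, B = 3000)

# Conjugate update: add successes to alpha and failures to beta
post_alpha <- prior_alpha + conversions
post_beta  <- prior_beta + (visitors - conversions)

# Posterior mean of each conversion rate
post_mean <- post_alpha / (post_alpha + post_beta)
print(round(post_mean, 4))
```

With a prior this weak, the posterior means sit very close to the observed conversion rates.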
install.packages("bayesAB")
library(bayesAB)
The link below contains all the information to explain the parameters and functions in the package bayesAB. CRAN
Using the previous example
library(bayesAB)
# Simulate one 0/1 outcome per visitor for each variant
A_binom <- rbinom(3450, 1, 0.065)
B_binom <- rbinom(3000, 1, 0.083)
About the rbinom function: rbinom(n, size, prob), where
n = number of observations
size = number of trials per observation
prob = probability of success on each trial
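As a quick check of what rbinom returns with size = 1, the following sketch simulates variant A once and inspects the result. The seed and the name A_sim are arbitrary choices made here so that the run is reproducible.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# 3450 simulated visitors, one Bernoulli trial each, 6.5% chance of converting
A_sim <- rbinom(n = 3450, size = 1, prob = 0.065)

print(table(A_sim))  # counts of 0s (no conversion) and 1s (conversion)
print(mean(A_sim))   # simulated conversion rate, close to but not exactly 0.065
```

Because the data are random draws, the simulated rate differs a little from 0.065 on every run unless a seed is fixed.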
We choose the alpha and beta values of the Beta prior from the prior knowledge we have about the parameters. Here I have shown the test with two sets of values. We generally use trial and error to get the distribution to look like the prior we have in mind. The peak should be centered over our expected mean based on previous experiments.
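Alongside the plots, it can help to look at a prior numerically. The sketch below uses a small helper, beta_summary, which is a hypothetical function written for this illustration and not part of bayesAB, to report the mean, standard deviation and central 95% range of a Beta(alpha, beta) distribution in base R.

```r
# Hypothetical helper: summarise a Beta(alpha, beta) prior
beta_summary <- function(alpha, beta) {
  c(mean  = alpha / (alpha + beta),
    sd    = sqrt(alpha * beta / ((alpha + beta)^2 * (alpha + beta + 1))),
    lower = qbeta(0.025, alpha, beta),   # 2.5% quantile
    upper = qbeta(0.975, alpha, beta))   # 97.5% quantile
}

prior_table <- rbind(
  "Beta(1, 1)"     = beta_summary(1, 1),
  "Beta(100, 200)" = beta_summary(100, 200)
)
print(round(prior_table, 3))
```

Beta(1, 1) is flat over the whole range from 0 to 1, while Beta(100, 200) is concentrated around a mean of 1/3. If historical conversion rates were closer to 7%, a prior such as Beta(7, 93) (a hypothetical choice) would have its mean at 0.07.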
plotBeta(1, 1)
plotBeta(100, 200) ## more specific range of p
AB1 <- bayesTest(A_binom, B_binom, priors = c('alpha' = 1, 'beta' = 1), distribution = 'bernoulli')
Saving the outputs of the test in AB2
AB2 <- bayesTest(A_binom, B_binom, priors = c('alpha' = 100, 'beta' = 200), distribution = 'bernoulli')
Here I have checked the AB2 test with an alpha and beta value of 100 and 200 respectively. You can also check the plots and results for AB1.
Print tells us the inputs we have made and the summary statistics of the data.
print (AB2)
summary (AB2)
The summary gives the credible interval. Bayesian credible intervals treat their bounds as fixed and the estimated parameter as a random variable, whereas frequentist confidence intervals treat their bounds as random variables and the parameter as a fixed value.
It also shows that P(A > B) is 0.00068%, so B is very likely better than A, and the posterior expected loss for choosing B over A is low.
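The difference between the two kinds of interval can be illustrated for a single conversion rate in base R. The sketch below is not a reproduction of the bayesAB summary output: it assumes a uniform Beta(1, 1) prior, uses the counts for variant B from the earlier example, and takes the 95% credible interval directly from the quantiles of the Beta posterior.

```r
conversions_B <- 250
visitors_B    <- 3000

# 95% credible interval for B's conversion rate, Beta(1, 1) prior (assumed)
cred_B <- qbeta(c(0.025, 0.975),
                1 + conversions_B,
                1 + visitors_B - conversions_B)

# 95% frequentist confidence interval for the same rate
conf_B <- as.numeric(prop.test(conversions_B, visitors_B)$conf.int)

print(rbind(credible = cred_B, confidence = conf_B))
```

With this much data and a flat prior the two intervals are numerically similar. What differs is the interpretation, as described above.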
plot (AB2)
The means are well separated, with only minimal overlap between the distributions. The credible interval highlights this overlap region. To quantify the findings, we calculate the probability of one variation beating the other, i.e. if we randomly draw a sample from the posterior of Product A and from that of Product B, what are the chances that the draw from B has a higher conversion rate than the draw from A.
So, from the diagrams and the summary of the test, we can address the problems we faced earlier with a simple prop.test.
Similarly, we can also try the test with other distributions such as Poisson, normal, exponential, etc., and check the results for them. Then we can combine the results of the tests and find an overall credible interval and a percentage of A over B or vice versa.
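The probability of one variant beating the other can be approximated by simulation. The following base R sketch assumes uniform Beta(1, 1) priors and the counts from the earlier example. The expected-loss formula is one simple definition chosen for illustration and is not necessarily the one bayesAB reports.

```r
set.seed(1)        # arbitrary seed for reproducibility
n_draws <- 100000  # number of posterior draws (assumed)

# Draws from each Beta posterior under a uniform Beta(1, 1) prior (assumed)
draws_A <- rbeta(n_draws, 1 + 225, 1 + 3450 - 225)
draws_B <- rbeta(n_draws, 1 + 250, 1 + 3000 - 250)

# Share of paired draws in which B has the higher conversion rate
prob_B_beats_A <- mean(draws_B > draws_A)

# One simple notion of expected loss from choosing B:
# the average shortfall in conversion rate, counting only draws where A is better
loss_choose_B <- mean(pmax(draws_A - draws_B, 0))

print(c(prob_B_beats_A = prob_B_beats_A, loss_choose_B = loss_choose_B))
```

Increasing n_draws reduces the simulation noise in both estimates.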
Advantages Of Bayesian A/B Test
Frequentist A/B test approaches are centered on hypothesis tests used with a point estimate (probability of rejecting the null) of a hard-to-interpret value. Oftentimes, the statistician or data scientist laying down the groundwork for the A/B test will have to do a power test to determine the sample size. This quickly gets messy in terms of interpretability. More importantly, it is simply not as robust as Bayesian A/B testing, and it does not have the ability to inspect an entire distribution over a parameter.
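A power calculation of this kind can be sketched with power.prop.test() from base R. The conversion rates are the ones used earlier in this article. The 5% significance level and 80% power are conventional values assumed here for illustration.

```r
# Visitors needed per variant to detect 6.5% vs 8.3% conversion
# at a 5% significance level with 80% power (both thresholds are assumptions)
plan <- power.prop.test(p1 = 0.065, p2 = 0.083,
                        sig.level = 0.05, power = 0.80)

print(ceiling(plan$n))  # required sample size in each group
```

The result is the number of visitors required in each group, so the total traffic needed is twice that figure.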
Bayesian statistics is simply more powerful and informative than a normal A/B test. While frequentist A/B testing requires the length of the test to be defined in advance, Bayesian testing does not. It can calculate the potential dangers of ending the test (the loss value) at any point, and gives a constantly updated probability of either variant being better and by how much. Ending the test early can be disastrous for frequentist A/B testing. A Bayesian approach, therefore, provides us with much greater flexibility during the experiment.
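The idea of a constantly updated probability can be sketched as follows. The code simulates a test day by day in base R, treats each day's posterior as the next day's prior, and recomputes the probability that B is better after every batch. The test length, daily traffic, seed and Beta(1, 1) priors are all hypothetical values chosen for illustration.

```r
set.seed(7)       # arbitrary seed
days    <- 10     # hypothetical test length
per_day <- 300    # hypothetical visitors per variant per day
rate_A  <- 0.065  # conversion rates used earlier in the article
rate_B  <- 0.083

# Beta(1, 1) priors for both variants (assumed)
a_A <- 1; b_A <- 1
a_B <- 1; b_B <- 1
prob_B_better <- numeric(days)

for (d in seq_len(days)) {
  conv_A <- rbinom(1, per_day, rate_A)  # conversions for A on day d
  conv_B <- rbinom(1, per_day, rate_B)  # conversions for B on day d

  # Yesterday's posterior becomes today's prior
  a_A <- a_A + conv_A; b_A <- b_A + per_day - conv_A
  a_B <- a_B + conv_B; b_B <- b_B + per_day - conv_B

  # Monte Carlo estimate of P(B > A) given all data so far
  prob_B_better[d] <- mean(rbeta(20000, a_B, b_B) > rbeta(20000, a_A, b_A))
}

print(round(prob_B_better, 3))
```

Each entry of prob_B_better is the estimate available at the end of that day, which is the kind of running figure a Bayesian test can report while the experiment is still in progress.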
Disadvantages Of Bayesian A/B Test
There is no agreed method for choosing a prior, and it requires skill to translate subjective prior beliefs into a mathematically formulated prior. If this is not done correctly, it could lead to misleading results. The posterior distribution can be heavily influenced by the choice of prior, and that choice is a subjective process. Moreover, Bayesian statistics can require substantial computational resources, particularly in models with a large number of parameters.
Conclusion
The main advantage of the Bayesian approach is the ability to include historical data and to select a prior distribution. The main disadvantage with this approach is the subjective nature of the selection process for the prior.