# CT3 Probability and Mathematical Statistics

*Are you appearing for CT3 this diet? Or have you cleared CT3 and are now preparing for interviews? Does the thought of going through the long list of formulae and concepts a day before the exam or interview sound daunting? Let’s make life simpler!*

**A brief run through CT3 Interview questions, Points to remember and Quick tips for the exam**

Though the exam involves solving problems and is not a theoretical paper, barring 1 or 2 questions which are usually definitions, these theoretical (or rather, conceptual) questions are often asked in the interview:

**Q. What is the difference between data and information?**

**Ans:** Data is facts or figures from which conclusions can be drawn. When the data is processed and transformed in such a way that it becomes useful to the users, it is known as ‘information’. For example, the weight of each individual in your classroom is data, whereas, the number of people in each weight category is information.

**Q. What is a random variable?**

**Ans:** Break the name down: random (associated with a probability) and variable (it takes different values). To put it neatly, it is a variable whose value is subject to variation due to chance.

If the random variable can take a countable number of distinct values, it is termed a discrete random variable. For example, consider tossing two coins and let the random variable X be the number of heads observed. The possible values taken by the random variable are 0, 1, and 2, which are discrete.

If the random variable can take any value in an interval (an uncountably infinite set of values), it is termed a continuous random variable. Examples include the heights and weights of the students in a class, the annual sales of a firm, and the temperature of a city.
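The two-coin example can be checked by listing every outcome. This is a minimal sketch that assumes both coins are fair (the example above does not say so explicitly); the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the four equally likely outcomes of tossing two fair coins
# (fairness is an assumption made for this illustration).
outcomes = list(product("HT", repeat=2))

# X = number of heads in each outcome.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability mass function of X, kept as exact fractions.
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P(X = {x}) = {p}")
```

X takes only the values 0, 1 and 2, and the probabilities sum to 1, which is what makes it a discrete random variable.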

**Q. What are generating functions?**

**Ans:** Generating functions provide a neat way of working out various properties of probability distributions without having to use integration or summation repeatedly. They can be used to find the mean, variance and higher moments of a probability distribution, to find the distribution of a linear combination of independent random variables, and to determine the properties of compound distributions.

**Q. What is the difference between probability generating function (PGF) and moment generating function (MGF)?**

**Ans:** The names give the game away: PGFs are used to generate probabilities, MGFs are used to generate moments.

A probability generating function (PGF) can be used to generate a set of probabilities, namely the probabilities associated with the values 0, 1, 2, 3, … assumed by a counting variable which assumes non-negative integer values.

A moment generating function (MGF) can be used to generate moments of the distribution of a random variable (discrete or continuous).
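In symbols, the standard definitions and the way each function "generates" its quantities can be written as follows, where X is a counting variable for the PGF and any random variable for which the expectation exists for the MGF:

$$
G_X(t) = E\left[t^X\right] = \sum_{k=0}^{\infty} P(X = k)\, t^k, \qquad M_X(t) = E\left[e^{tX}\right]
$$

$$
P(X = k) = \frac{G_X^{(k)}(0)}{k!}, \qquad E\left[X^k\right] = M_X^{(k)}(0), \qquad E[X] = G_X'(1) = M_X'(0)
$$

Note the different evaluation points for the mean: the PGF derivative is evaluated at t = 1, the MGF derivative at t = 0.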

**Q. Explain the concept of the p-value in layman terms?**

**Ans:** Suppose a pizza place claims that its delivery times are 30 minutes or less on average, but you think they are longer than that. You conduct a hypothesis test because you believe the null hypothesis (H0), that the mean delivery time is at most 30 minutes, is incorrect.

Your alternative hypothesis (H1) is that the mean time is greater than 30 minutes. You randomly sample 100 delivery times and calculate a test statistic from the sample mean. Suppose the resulting p-value (probability value) turns out to be 0.02, which is less than your significance level of 0.05, so you reject H0. In real terms, if the pizza place’s claim were true, there would be only a 2% chance of observing delivery times at least as long as those in your sample. Note that the p-value is not the probability that the claim is true, nor the probability that you have rejected it mistakenly.
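A p-value of about 0.02 could arise from a one-sided z-test like the sketch below. The sample mean (32.05 minutes) and the known standard deviation (10 minutes) are hypothetical figures chosen only for illustration; they are not part of the example above.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical figures, chosen only for illustration (not from a real study).
claimed_mean = 30.0   # H0: mean delivery time is at most 30 minutes
sample_mean = 32.05   # assumed average of the sampled delivery times
population_sd = 10.0  # assumed known standard deviation, in minutes
n = 100               # sample size
alpha = 0.05          # significance level

# Test statistic: how many standard errors the sample mean lies above the claim.
z = (sample_mean - claimed_mean) / (population_sd / sqrt(n))

# One-sided p-value: P(Z >= z) when H0 holds at its boundary (mean = 30).
p_value = 1 - NormalDist().cdf(z)
reject_h0 = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

With these assumed inputs the p-value comes out at roughly 0.02, below the 0.05 significance level, so H0 would be rejected.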

**Q. What do you mean by 95% confidence interval?**

**Ans:** A confidence interval is a range of values, calculated from a sample, that is likely to contain the true population value you would find if it were possible to survey the entire population. Confidence intervals are intrinsically connected to the confidence level, which is expressed as a percentage (for example, a 95% confidence level). It means that if you were to repeat an experiment or survey over and over again, 95% of the intervals constructed in this way would contain the true population value.

For example, suppose we measure the heights of 40 randomly chosen men and get a mean height of 175 cm and a standard deviation of 20 cm. Using the normal approximation, the 95% confidence interval is 175 ± 1.96 × 20/√40, which is approximately (168.8, 181.2). This means that 95% of intervals from experiments like the one we just did will include the true mean, but 5% won’t.
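The interval for the height example can be reproduced with a short calculation. This sketch assumes a normal approximation (a z-interval of the form mean ± z × sd/√n); with only 40 observations a t-interval would be slightly wider.

```python
from math import sqrt
from statistics import NormalDist

# Figures from the height example.
n = 40
sample_mean = 175.0  # cm
sample_sd = 20.0     # cm
confidence = 0.95

# Two-sided critical value from the standard normal (about 1.96 for 95%).
# Using z rather than t is an assumption made for simplicity.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error = critical value x standard error of the mean.
margin = z * sample_sd / sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin

print(f"95% CI: ({lower:.1f}, {upper:.1f})")
```

The interval is symmetric about the sample mean, which is a quick sanity check worth doing in the exam.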

**Q. What is the difference between t-test and ANOVA?**

**Ans:** When the population means of only two groups are to be compared, the t-test is used, but when means of more than two groups are to be compared, ANOVA is used.

**Q. What is the use of R² in regression?**

**Ans:** R-squared is a goodness-of-fit measure for linear regression models. It indicates the percentage of the variation in the dependent variable that the independent variables explain collectively.
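As a rough illustration, R² for a simple linear regression can be computed as the regression sum of squares divided by the total sum of squares. The data points below are made up purely for this sketch.

```python
# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sums of squares and cross-products.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)  # total variation in y
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates of the slope and intercept.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# R-squared: proportion of the variation in y explained by the fitted line.
r_squared = s_xy ** 2 / (s_xx * s_yy)

print(f"fitted line: y = {intercept:.2f} + {slope:.2f}x, R-squared = {r_squared:.4f}")
```

A value close to 1, as here, indicates that the fitted line explains almost all of the variation in y.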

*It’s said that practice makes a man perfect, but in reality, no one is perfect. Let’s have a look at some commonly made mistakes and some important points to remember while attempting the exam:*

- For the histogram, check if all class intervals have an equal class width. If not, calculate frequency density by using the formula:

  **Frequency Density = Frequency/Class Width**
- Read the questions involving probability tree diagrams very carefully. Though this applies to all questions, these questions are deceptive at times, so re-verify your diagram and calculations.
- In questions involving writing the cumulative distribution function (CDF), don’t forget to mention the complete CDF i.e. stating the range for which the CDF is 0 and 1.
- The first-order derivative of the probability generating function (PGF) evaluated at 1 gives the mean, while the first-order derivative of the moment generating function (MGF) evaluated at 0 gives the mean. Always check whether you are calculating the mean from the MGF or the PGF, and substitute the value at which the derivative should be evaluated accordingly.
- Continuity correction is used when you use a continuous probability distribution to approximate a discrete probability distribution.
- In regression, for ease of calculation, try to use the change of scale and origin when the values of the predictor variables are large.
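To see why the continuity correction mentioned above matters, the sketch below compares an exact binomial probability with its normal approximation, with and without the correction. The distribution Bin(100, 0.5) and the cut-off 45 are arbitrary choices made for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

# Illustrative choice: X ~ Binomial(n = 100, p = 0.5), and we want P(X <= 45).
n, p, k = 100, 0.5, 45

# Exact binomial probability, summing the probability function from 0 to k.
exact = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Normal approximation with the same mean and variance as the binomial.
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# Without correction: P(Y <= 45). With correction: P(Y <= 45.5).
without_correction = approx.cdf(k)
with_correction = approx.cdf(k + 0.5)

print(f"exact              = {exact:.4f}")
print(f"without correction = {without_correction:.4f}")
print(f"with correction    = {with_correction:.4f}")
```

The corrected value lands much closer to the exact probability, because the discrete event X ≤ 45 corresponds to the continuous region up to 45.5.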

**Some quick tips for CT3 preparation:**

- Make a formulae sheet for each chapter so that you have a quick summary to look at while revising and when you get stuck while practicing problems.
- Solve all the questions given in the course notes and revision notes. Try solving the revision notes questions immediately after finishing the chapters corresponding to a particular revision notes booklet. Try not to pile it up to a month before the exam.
- While practicing questions involving actuarial tables, actually use them! Don’t leave the solution halfway at the probability expression; always calculate the final answer. Though it sounds funny, a few candidates run short of time while searching for the relevant statistical distribution table and formulae in the actuarial tables, as they haven’t practiced doing it.

CT3 is a relatively easy exam and is not very difficult in terms of conceptual understanding, making it the easiest subject when it comes to preparing for interviews. Coming to the exam, just be regular in your studies, practice every day and, most importantly, don’t be overconfident while attempting the exam. No one can stop you from clearing CT3!

**All the best!!**