Introduction To Descriptive and Predictive Analytics Part – 1
Modelling is what we most often think of when we think of data mining. Descriptive and Predictive Analytics is the process of taking some data (usually) and building a simplified description of the processes that might have generated it. The description is often a computer program or mathematical formula. A model captures the knowledge exhibited by the data and encodes it in some language.
Often the aim is to address a specific problem through modelling the world in some form and then use the model to develop a better understanding of the world.
Part – 1: Descriptive Analytics
 An Operational Decision Problem
 Forecasting with Past Historical Data
 Moving Averages
 Exponential Smoothing
 Thinking about Trends and Seasonality
 Forecasting for New Products
 Fitting distributions
Descriptive Analytics
 Before we dive into analyzing data, let us look at a fundamental problem that firms face.
 Operations problem:
 How much to produce?
 We need to know or estimate the cost of the product, the price of the product, and some data on the demand for the product.
 Let us explore a problem to get started.
A Fundamental Operations Problem: An example
 Suppose that you are making operations decisions for a retailer who orders a product from a supplier and sells it to customers.
 The ordered product items are received and placed on the store shelf.
 There is a large customer population
 Each customer may choose to buy or not buy the product.
 If the customer chooses to buy, he arrives at the store to buy the product.
 He buys it as long as it is available on the shelf.
 However, you have to order the product before you see the customer demand since you have to have the items available on the shelf.
 You get only one chance to order (i.e., you cannot change your purchase order after your decision).
An Operations Problem: Costs
 You order the product from the supplier at cost = 3 talers/item. (Talers are the currency units).
 After your order is received and placed on shelves, demand occurs.
 The product on the shelf sells at price = 12 talers/item.
 All unsold items are salvaged. Salvage value =0 talers/item.
 Let us look at the timeline of events.
Timeline of Events
Demand is uncertain. Suppose you bought 10 items
 A High Demand Scenario: Demand is 100. You will sell all 10 items and make a profit of 10*(12 - 3) = 90 talers.
 A Low Demand Scenario: No demand (i.e., demand = 0). You sell nothing and lose 10*3 = 30 talers.
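Both scenarios follow from one profit rule: revenue is earned on the smaller of the order quantity and the demand, while the purchase cost is paid on every item ordered. A minimal Python sketch of that rule, using the cost, price, and salvage value stated above (the function and parameter names are illustrative and not part of the original example):

```python
def newsvendor_profit(order_qty, demand, price=12.0, cost=3.0, salvage=0.0):
    """Profit in talers for one selling period.

    Assumes every unit is bought at `cost`, sold at `price` while stock
    lasts, and that leftover units are salvaged at `salvage` per item.
    """
    units_sold = min(order_qty, demand)   # cannot sell more than is on the shelf
    units_left = order_qty - units_sold   # unsold items are salvaged
    return price * units_sold + salvage * units_left - cost * order_qty


high = newsvendor_profit(order_qty=10, demand=100)  # high demand scenario
low = newsvendor_profit(order_qty=10, demand=0)     # low demand scenario
print(high, low)
```

With an order of 10 items, the function returns a profit of 90 talers for a demand of 100 and a loss of 30 talers for a demand of 0, matching the two scenarios.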
Problem Recap
You don’t know what the demand is going to be
You have to decide on the number of units to order from the supplier before seeing the customer demand.
What could help?
 Past demand data
 Fortunately, we have the demand data from the past 100 periods.
The chart shows the demands (y-axis) observed in the past 100 periods (x-axis).
Past Demand Data
 Some more information from past demand data, based on the observations over the past 100 such periods:
– Maximum demand observed was 81.
– Minimum demand observed was 15.
– The arithmetic average of those 100 observations is 52.8.
 Based on the data, I am going to ask you to go through an exercise on deciding how much to order.
Before you make your decision
 There is no penalty for a wrong answer, or conversely, no extra course credit for the right answer.
 You get one attempt at making your decision.
 The objective of the exercise is not to test or grade you, but to set a baseline of your initial thinking as we start the course.
 Write down your answer on a sheet of paper and keep the sheet through the course.
 We will see the best answer and you will then get a chance to compare your answers and calibrate learning progress.
The problem you just saw is called a Newsvendor problem.
Its characteristics are:
 You have an objective (usually maximize profits, minimize costs, improve market share, etc.)
 You have to make one decision (usually, how much to buy or plan for) before you see the future demand.
 Demand occurs, and profits and costs are realized.
It is called the newsvendor problem because it is similar to the situation of a vendor who sells newspapers:
 Buy too much and you may be left with unsold newspapers.
 Buy too little and you will forgo a revenue opportunity.
In this course, we will show you how to think about and analyze this problem.
A Business Application at Time Inc.
Time Magazine Supply chain:
Stores were either selling out of their inventories (too little inventory) or selling only a small fraction of their allocation (too much inventory).
Time Magazine evaluated and adjusted three decisions for every issue:
 National print order (total number of copies printed and shipped).
 Wholesale allotment structure (how those copies are allotted to wholesalers).
 Store distribution (final distribution to stores).
Note: the above three decisions are made before the actual demand is realized.
Need to analyze past data and forecast future demand.
Time Magazine reports saving $3.5M annually from tackling the newsvendor problem.
Broader applications of the Newsvendor problem
Governments order flu vaccines before the flu season begins, and before the extent or the nature of the flu strain is known.
 How many vaccines to order?
Smartphone users buy mobile data plans before they know their actual future usage.
 What is the right plan for you?
Consumers buy health insurance plans before they know their actual health expenditures.
 How to think about the right plans?
For all the above examples, some forecast of future demand is essential.
Introduction to Forecasting
What is forecasting?
Its primary function is to predict the future.
Why are we interested?
Forecasts dictate the decisions we make today.
Examples: who uses forecasting in their jobs?
 Forecast demand for products and services
 Forecast inventory and capacity needs daily
What makes a good forecast?
 It should be timely and reliable.
 It should be as accurate as possible.
 It should be in meaningful units.
 The method should be easy to use and be understood in practice.
Characteristics of Forecasts
Point forecasts are usually wrong! Why?
1. Example: In December 2015, there will be 37 cm of snow.
2. Example: We will sell 314 umbrellas during the rains next week.
3. Reason: Demand could be a random variable.
Therefore, a good forecast should be more than a single number
1. Mean and standard deviation
2. Range (high and low), e.g. weather forecasts.
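As a rough illustration of a forecast that is more than a single number, the sketch below summarizes a demand history by its mean, standard deviation, and range. The ten demand values are hypothetical and chosen only for illustration; they are not the course data.

```python
import statistics

# Hypothetical demand history (illustrative values only, not the course data).
past_demand = [48, 61, 35, 57, 52, 70, 44, 66, 50, 57]

forecast = {
    "mean": statistics.mean(past_demand),
    "std_dev": statistics.stdev(past_demand),  # sample standard deviation
    "low": min(past_demand),
    "high": max(past_demand),
}
print(forecast)
```

Reporting the spread and the range alongside the mean tells the decision maker how far actual demand might plausibly fall from the central estimate.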
Modeling Uncertain Future: Probability Distributions
 We often do not control purchasing behavior. As a result, we cannot predict future demand with certainty.
 How do we describe uncertain future demand?
 We can try to decide which future demand scenarios are possible and, for each scenario, estimate the likelihood of its realization.
 Where do scenarios come from?
1. Past data
2. Expert estimates
An Example of a Model of Future Demand
 Let's start by looking at a small number of scenarios, say, three: high demand, ordinary demand and low demand.
 Let's say that the high demand scenario corresponds to the demand value of 80, the ordinary demand scenario to the value of 50, and the low demand scenario to the value of 20.
 For each scenario, a likelihood of its occurring must be estimated.
Example of a Model of Future Demand: How likely is Each Scenario?
 For each scenario, a likelihood of its coming true must be estimated
 Where do estimates of likelihood come from?
1. Statistical analysis of past data
 Suppose that after analyzing the past data and using subjective inputs, we estimate that the scenarios have the following likelihoods of being realized in the next selling season:
 Likelihood of high demand is 20%
 Likelihood of ordinary demand is 70%
 Likelihood of low demand is 10%
Three Scenarios and Probability Distribution
In other words, we project that the demand is not equal to a certain number with probability 1, but rather can take one of three values with those probabilities.
We have just created a probability distribution for the future demand:
 $D_1 = 80$ with probability $p_1 = 0.2$
 $D_2 = 50$ with probability $p_2 = 0.7$
 $D_3 = 20$ with probability $p_3 = 0.1$
Probability distributions like this one, described by a number of distinct scenarios with attached probabilities, are called discrete.
Note that the probabilities are:
 greater than zero, and
 they sum up to 1.
Three Scenarios Probability Distribution: Scenarios and Their Probabilities
Describing Probability Distribution: Mean and Standard Deviation
 For any probability distribution, including a simple one reflecting three demand scenarios, two useful descriptive quantities are often calculated: mean (also called expected value) and standard deviation
 For a discrete probability distribution, the mean is defined as a sum of the products of scenario values and their probabilities
 For our demand distribution, the mean is $\mu_D = p_1 D_1 + p_2 D_2 + p_3 D_3 = 0.2 \cdot 80 + 0.7 \cdot 50 + 0.1 \cdot 20 = 53$.
 The mean reflects the demand value that we would get, on average, in a selling season if we kept observing the demand realizations over an infinite number of selling seasons.
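The long-run-average reading of the mean can be illustrated by simulation. The sketch below computes the probability-weighted mean of the three-scenario distribution and compares it with the average demand over many simulated selling seasons. The seed and the number of simulated seasons are arbitrary choices made here for reproducibility.

```python
import random

# Three-scenario demand distribution from the text: value -> probability.
scenarios = {80: 0.2, 50: 0.7, 20: 0.1}

# Mean as the probability-weighted sum of scenario values.
mean_demand = sum(value * prob for value, prob in scenarios.items())

# Simulate many selling seasons; the seed and season count are arbitrary.
rng = random.Random(0)
draws = rng.choices(list(scenarios), weights=list(scenarios.values()), k=100_000)
simulated_average = sum(draws) / len(draws)

print(round(mean_demand, 6), round(simulated_average, 2))
```

The simulated average should land close to the computed mean of 53, and the gap typically shrinks as the number of simulated seasons grows.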
Three Scenarios Probability Distribution: Mean
Describing Probability Distribution: Mean and Standard Deviation
 Standard deviation describes, roughly speaking, how far away actual random variable values are from the mean, on average. In other words, it describes how, in a colloquial sense, spread out the distribution is around its mean
 Standard deviation is defined as a square root of the sum of products of scenario probabilities and the squares of the difference between scenario value and the mean value
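 For example, for the three-scenario demand probability distribution we consider, the standard deviation is calculated as $\sigma_D = \sqrt{0.2 (80 - 53)^2 + 0.7 (50 - 53)^2 + 0.1 (20 - 53)^2} = \sqrt{261} \approx 16.2$.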
Three Scenarios Probability Distribution: Mean and Standard Deviation
Knowledge of mean and standard deviation values helps to support a general intuition about the nature of a random variable
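The definition of the standard deviation can be checked numerically for the three-scenario distribution. This is a minimal sketch that follows the definition step by step; the variable names are illustrative.

```python
import math

scenarios = {80: 0.2, 50: 0.7, 20: 0.1}  # demand value -> probability

# Mean: sum of scenario values weighted by their probabilities.
mean_demand = sum(d * p for d, p in scenarios.items())

# Variance: probability-weighted squared distances from the mean.
variance = sum(p * (d - mean_demand) ** 2 for d, p in scenarios.items())

# Standard deviation: square root of the variance.
std_demand = math.sqrt(variance)

print(round(mean_demand, 2), round(variance, 2), round(std_demand, 2))
```

For these scenarios the variance comes out at 261, so the standard deviation is roughly 16.2 units of demand around a mean of 53.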
Mean and Standard Deviation: More than three scenarios
 What if we have more than three scenarios?
– $D_1$ with probability $p_1$
– $D_2$ with probability $p_2$
– $D_3$ with probability $p_3$
– ...
– $D_n$ with probability $p_n$
And $p_1 + p_2 + p_3 + \dots + p_n = 1$
 What about the mean and standard deviation of this demand distribution for n scenarios?
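Applying the same verbal definitions term by term, the mean and standard deviation for $n$ scenarios can be written as follows, with $D_i$ denoting the demand value in scenario $i$ and $p_i$ its probability:

$$\mu_D = \sum_{i=1}^{n} p_i D_i, \qquad \sigma_D = \sqrt{\sum_{i=1}^{n} p_i \, (D_i - \mu_D)^2}$$

With $n = 3$ these reduce to the three-scenario calculations.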
Discrete vs. Continuous Probability Distributions
 So far, we have looked at a discrete probability distribution with a number of future scenarios with attached probability for each scenario
 But what happens to a discrete probability picture when
– the random variable being modelled has a really large number of scenarios on any small interval of its possible values, and
– the probability that any one scenario is realized is really small?
 Think of examples such as stock prices, or the amount of rainfall in a region.
 In such cases, it makes sense to describe the probability distribution using groups of scenarios rather than focusing on individual scenarios.
Continuous Distribution: Random Variable X
Normal Distribution
 One of the most popular examples of a continuous probability distribution is the normal distribution. It
 allows the underlying random variable to take any value from negative infinity to positive infinity, and
 is completely characterized by two parameters: mean $\mu$ and standard deviation $\sigma$.
There exist statistical formulas (also implemented in Excel) that calculate the probability that a normal random variable $X$ with given mean $\mu$ and standard deviation $\sigma$ produces a value within a specified interval of values $[X_{min}, X_{max}]$.
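The interval probability is the difference between the cumulative probabilities at the two endpoints. Here is a minimal sketch using `NormalDist` from Python's standard library; the mean, standard deviation, and interval are hypothetical values chosen for illustration.

```python
from statistics import NormalDist


def normal_interval_probability(mean, std_dev, x_min, x_max):
    """P(x_min <= X <= x_max) for a normal random variable X."""
    dist = NormalDist(mu=mean, sigma=std_dev)
    return dist.cdf(x_max) - dist.cdf(x_min)


# Hypothetical parameters: mean 50, standard deviation 15, interval [35, 65].
prob = normal_interval_probability(50, 15, 35, 65)
print(round(prob, 4))
```

The chosen interval spans one standard deviation on either side of the mean, so the result is about 0.68. The same difference of cumulative probabilities can be computed in Excel with `NORM.DIST(x, mean, standard_dev, TRUE)` evaluated at each endpoint.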
Other Continuous Probability Distributions
 There exist a large number of other popular continuous probability distributions (exponential, beta, etc.) with easily computable mean and variance/standard deviation.
 Each of those distributions is often used to describe a specific uncertain setting or quantity.
 For example, the normal distribution is used to describe the distribution of future relative (percentage) changes in the values of stocks and FX rates.
 Another example: the exponential distribution can be used to characterize the time between successive arrivals of customers in service systems (e.g. call centres).
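To make the call-centre example concrete, the sketch below evaluates the exponential distribution's cumulative probability, $P(T \le t) = 1 - e^{-t/m}$, where $m$ is the mean time between arrivals. The 2-minute mean gap is a hypothetical value, and the function name is illustrative.

```python
import math


def prob_next_arrival_within(t, mean_gap):
    """P(T <= t) when the time T between arrivals is exponential with mean `mean_gap`."""
    return 1.0 - math.exp(-t / mean_gap)


# Hypothetical call centre: on average 2 minutes between successive calls.
mean_gap_minutes = 2.0
for t in (1.0, 2.0, 5.0):
    print(t, round(prob_next_arrival_within(t, mean_gap_minutes), 3))
```

For the exponential distribution a single parameter fixes both summary quantities, because its standard deviation equals its mean.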
Returning to Characteristics of Forecasts
1. Point forecasts are usually wrong! Why?
 Demand could be a random variable
2. Therefore, a good forecast should be more than a single number
3. Forecasts should include some distribution information:
 Mean and standard deviation
 Range (high and low)
4. Aggregate forecasts are usually more accurate
5. The accuracy of forecasts erodes as we go further into the future
6. Don’t exclude known information
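One way to see why aggregate forecasts tend to be more accurate is to compare relative uncertainty (standard deviation divided by mean) for a single store and for a group of stores. The sketch below assumes, purely for illustration, 25 stores with independent demands that share the same hypothetical mean and standard deviation. Under independence, means add and variances add.

```python
import math


def coefficient_of_variation(mean, std_dev):
    """Relative uncertainty: standard deviation as a fraction of the mean."""
    return std_dev / mean


# Hypothetical store-level demand: mean 53, standard deviation 16 per store.
store_mean, store_std = 53.0, 16.0
n_stores = 25

# For independent demands, means add and variances add.
total_mean = n_stores * store_mean
total_std = math.sqrt(n_stores) * store_std

cv_store = coefficient_of_variation(store_mean, store_std)
cv_total = coefficient_of_variation(total_mean, total_std)
print(round(cv_store, 3), round(cv_total, 3))
```

Under these assumptions the aggregate's relative uncertainty is smaller by a factor of the square root of the number of stores. If store demands are positively correlated, the improvement is smaller.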
Subjective Forecasting Methods
 Composites
 Sales Force Composites: Aggregation of sales personnel estimates.
 Election Polling Composites: Sites that aggregate polls.
 Customer Surveys
 Jury of Executive Opinion
 The Delphi Method
 Individual opinions are compiled and reconsidered. Repeat until overall group consensus is (hopefully) reached.
 We will return to subjective forecasting methods at the end of Part 1 (Last Session).
How to forecast with past data, objectively?
We can leverage past data to come up with forecasts:
Two primary methods:
 Causal models
 Time series methods
Causal Models
– Let D be the demand or future outcome to be predicted, and assume that there are n variables (or root causes) that influence the demand.
– A causal model is one in which demand D is formulated as a theoretical function of all those n causes.
– Causal models are generally intricate and complex and need advanced tools in addition to domain expertise.
– In this course, we will focus mainly on time series based models.
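In symbols, if $x_1, \dots, x_n$ denote the $n$ causes, a causal model has the general form shown first below. The linear form that follows is only one common illustrative choice; the coefficients $\beta_i$ and the error term $\varepsilon$ are assumptions introduced here, not part of the course material.

$$D = f(x_1, x_2, \dots, x_n)$$

$$D = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \varepsilon$$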
Time Series Methods
 A time series is just a collection of past values of the variable being predicted.
 Time series forecasting can be considered a naive method. The goal is to isolate patterns in past data.
 Past data might have characteristics such as:
1. Trend
2. Seasonality/Cycles
3. Randomness
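To make the three characteristics concrete, the sketch below builds a synthetic series by adding a trend, a seasonal cycle, and random noise. The additive structure and every number in it (starting level, growth rate, cycle length, noise level, seed) are assumptions chosen for illustration, not properties of any real demand data.

```python
import math
import random

rng = random.Random(1)  # fixed seed so the sketch is reproducible
n_periods, season_length = 24, 12

series = []
for t in range(n_periods):
    trend = 50 + 0.5 * t                                          # steady growth
    seasonality = 10 * math.sin(2 * math.pi * t / season_length)  # repeating cycle
    noise = rng.gauss(0, 3)                                       # randomness
    series.append(trend + seasonality + noise)

print([round(x, 1) for x in series[:6]])
```

Time series methods work in the opposite direction: they start from an observed series and try to separate the pattern (trend and seasonality) from the randomness.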
Next…
Forecasting with past historical data
Moving Averages
Exponential smoothing