Random variables & distributions

Chapter 2


As analysts, we work with discrete random variables or continuous random variables.

The X values are 0, 1, 2, 3, 4 and 5 radios sold per week (column B). The probability P(x) of each number of radios sold is in column C (e.g. 0 radios sold is 3%; 1 radio sold is 20%; 2 radios sold is 50%…)

In column D (x*p(x)) we multiply each value in column B by each value in column C

For each X value in column B we subtract the mean (2.1) and square the difference

We multiply this value by the p(x) value in column C


We copy and paste the formula in E6 to cells E7 to E11, then sum the E6 to E11 values to get the variance.


The probability that radios sold fall within 2 standard deviations of the mean is 0.2 + 0.5 + 0.2 = 0.9, or 90%
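The spreadsheet steps above can be sketched in Python. This is a minimal sketch, not the workbook itself. The text only shows the probabilities for 0, 1 and 2 radios, plus the 0.2 for 3 radios in the sum above; the values 0.05 and 0.02 for 4 and 5 radios are assumptions, chosen so the probabilities sum to 1 and the mean equals the 2.1 used above.

```python
import math

# Probability distribution of radios sold per week.
# p(4) and p(5) are ASSUMED values (not shown in the text), chosen so the
# probabilities sum to 1 and the mean matches the 2.1 used in the text.
dist = {0: 0.03, 1: 0.20, 2: 0.50, 3: 0.20, 4: 0.05, 5: 0.02}

# Column D: x * p(x), summed to give the mean.
mean = sum(x * p for x, p in dist.items())

# Column E: (x - mean)^2 * p(x), summed to give the variance.
variance = sum((x - mean) ** 2 * p for x, p in dist.items())
std_dev = math.sqrt(variance)

# Probability that sales fall within 2 standard deviations of the mean.
low, high = mean - 2 * std_dev, mean + 2 * std_dev
prob_within_2sd = sum(p for x, p in dist.items() if low <= x <= high)

print(f"mean={mean:.2f} variance={variance:.2f} sd={std_dev:.3f}")
print(f"P(within 2 sd)={prob_within_2sd:.2f}")
```

Under these assumed values, only 1, 2 and 3 radios fall inside the 2 standard deviation range, which is why the three probabilities 0.2, 0.5 and 0.2 are the ones added.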

As an analyst you will be faced with 2 special types of discrete probability distributions.



Binomial distributions occur when you count the number of successes in a sequence of n independent binary (yes/no) trials, each of which yields success with probability p.

Examples of Binomial Distributions in Business

A company is making transistors. Every hour a supervisor takes a random sample of n = 5. The probability (p) that a transistor is bad is 0.15.

What is the probability of finding r = 3, 4 or 5 bad transistors?

P(X=r) in Excel is BINOMDIST(…)

For r = 3: number is 3; trials is 5; probability is 0.15; cumulative is FALSE since we want to know the probability of finding exactly r = 3 bad transistors

P(X=r) is about .024 for r = 3, .002 for r = 4 and .0001 for r = 5. We add up all 3 values to find the probability of 3 or more defects: .0266 or 2.66%.

We could have done this another way using the TRUE cumulative function in Excel…

Since n = 5, the probability of 3 or more transistor defects is the same as 100% minus the probability of 2 or fewer, P(x≤2).






Cumulative is TRUE since we are looking at x values of 0, 1 and 2
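Both routes to the answer can be checked outside Excel. The sketch below is an illustration in Python, using only the n = 5 and p = 0.15 from the example; `binom_pmf` is a hypothetical helper name, not an Excel function.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for a binomial distribution with n trials and success probability p."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

n, p = 5, 0.15  # sample size and probability a transistor is bad (from the text)

# Route 1: add P(X=3), P(X=4) and P(X=5) directly (cumulative = FALSE three times).
p_direct = sum(binom_pmf(r, n, p) for r in (3, 4, 5))

# Route 2: 1 minus the cumulative probability P(X <= 2) (cumulative = TRUE once).
p_complement = 1 - sum(binom_pmf(r, n, p) for r in (0, 1, 2))

print(f"direct={p_direct:.4f} complement={p_complement:.4f}")
```

The two routes agree at about .0266, so the choice between them is a matter of convenience: the cumulative route needs one function call instead of three.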


The expected (µ = mean) value of a binomial distribution is n*p (in our example µ = 5*.15 = .75 defective transistors)

But how do we know if the binomial distribution is approximately normal?

It depends on the number of trials (n) and the probability of success (p)

If np(1-p) ≥ 10, the binomial distribution is approximately a bell-shaped normal distribution

In our example, n = 5 and the probability (p) of finding a bad transistor is 0.15, so np(1-p) is only about 0.64 and our binomial distribution is skewed right. We cannot use σ and µ to define a defect range.
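The rule of thumb is easy to check for any n and p. The sketch below is an illustration in Python; the threshold of 10 is the one stated above, and `binomial_summary` is a hypothetical helper name.

```python
from math import ceil

def binomial_summary(n, p):
    """Return the mean n*p and the quantity n*p*(1-p) used in the rule of thumb."""
    return n * p, n * p * (1 - p)

n, p = 5, 0.15  # transistor example from the text
mean, check = binomial_summary(n, p)

# Rule of thumb from the text: treat the binomial as roughly normal when n*p*(1-p) >= 10.
looks_normal = check >= 10

# Smallest sample size that would satisfy the rule at this p.
min_trials = ceil(10 / (p * (1 - p)))

print(f"mean={mean:.2f} np(1-p)={check:.4f} normal?={looks_normal} min n={min_trials}")
```

At p = 0.15 the supervisor would need a sample of roughly 79 transistors, not 5, before the normal approximation becomes reasonable under this rule.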

Poisson distributions are discrete probability distributions that express the probability of a number of events occurring in a fixed period of time if these events occur (1) with a known average rate µ and (2) independently of the time since the last event.

Examples of Poisson Distributions in Business

Let’s say defects in a factory occur randomly at an average rate (µ) of 1.8 defects per hour.

What is the probability p(x) of observing x=4 defects in a given hour at the factory?

In Excel, the formula is =Poisson(…)



Cumulative = FALSE since we are concerned only with x = 4 defects

The probability of 4 defects happening in a given hour is 7.2%

What is the probability of observing 2 or fewer defects, P(x≤2), in a given hour at the factory?




Cumulative = TRUE since we are concerned with 0, 1 and 2 defects in an hour

The probability of observing 2 or fewer defects, P(x≤2), in a given hour at the factory is 73%
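The two Poisson results can be reproduced from the formula P(X = x) = e^(-µ) µ^x / x!. The sketch below is an illustration in Python with µ = 1.8 from the example; `poisson_pmf` and `poisson_cdf` are hypothetical helper names that play the roles of cumulative = FALSE and cumulative = TRUE.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson distribution with average rate mu."""
    return exp(-mu) * mu**x / factorial(x)

def poisson_cdf(x, mu):
    """P(X <= x): sum of the individual probabilities for 0, 1, ..., x."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))

mu = 1.8  # average defects per hour (from the text)

p_exactly_4 = poisson_pmf(4, mu)  # cumulative = FALSE
p_at_most_2 = poisson_cdf(2, mu)  # cumulative = TRUE

print(f"P(X=4)={p_exactly_4:.3f} P(X<=2)={p_at_most_2:.3f}")
```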

Can we find a range of values for the # of defects?

It depends on the average rate of occurrence µ per unit time

As the average rate of occurrence in a Poisson distribution increases, so does the spread (the standard deviation σ equals √µ)




Up to now we have been talking about discrete distributions.

With continuous probability distributions we figure out the probability that a random variable (x) will fall in an interval from a to b

In a continuous probability distribution:

f(x) ≥ 0 for all values of x

The total area under the curve f(x) equals 1

There are different types of continuous probability distributions. We will look at 4:





A normal distribution is one where the data is distributed symmetrically around the mean, which when plotted results in a bell curve

Why are normal distributions important?

Because of the Empirical Rule….

So, if we knew on average ( µ ) company sales were $10,000/day and the standard deviation ( σ ) in sales was $2,000, there is a:

68% probability sales are from $8,000 to $12,000

95% probability sales are from $6,000 to $14,000

99.7% probability sales are from $4,000 to $16,000
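The three percentages come straight from the normal curve. The sketch below is an illustration using Python's standard `statistics.NormalDist` with the sales figures from the example (µ = $10,000, σ = $2,000); the variable names are illustrative.

```python
from statistics import NormalDist

# Daily sales modeled as normal with the mean and standard deviation from the text.
sales = NormalDist(mu=10_000, sigma=2_000)

within = {}
for k in (1, 2, 3):
    low = sales.mean - k * sales.stdev
    high = sales.mean + k * sales.stdev
    # Area under the curve between low and high.
    within[k] = sales.cdf(high) - sales.cdf(low)
    print(f"${low:,.0f} to ${high:,.0f}: {within[k]:.1%}")
```

The exact areas are about 68.3%, 95.4% and 99.7%, which the Empirical Rule rounds to 68%, 95% and 99.7%.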

We can “standardize” normal distributions with a z score, z = (x − µ)/σ, which tells us how many standard deviations (σ) the value (x) is from the mean (µ)

Suppose travel time to work (in minutes)

Ok, so z scores tell us how many standard deviations we are from the mean. Why is that important?

Remember, the Empirical Rule relates standard deviations to probability so…

Based on Z scores we can calculate probabilities of occurrence.

Suppose build time for a product is normally distributed with an average of 100 days and a standard deviation of 20 days. The sales team promises the customer a build time of no more than 125 days. What is the probability the factory can deliver on time?

In Excel, we can calculate the z score: z = (125 − 100)/20 = 1.25

P(Z≤1.25) in Excel is =NORMDIST(..)

C5 = 125; C3 = 100; C4 = 20; cumulative is TRUE since we are looking at all values through 125

For a product which takes on average 100 days to build with a 20 day standard deviation there is an 89.4% chance it can be built in 125 days.
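The same calculation can be sketched in Python with the standard `statistics.NormalDist`. This is only an illustration; the variable names are assumptions, and the numbers are the 100 day mean, 20 day standard deviation and 125 day promise from the example.

```python
from statistics import NormalDist

mean_days, sd_days, promised_days = 100, 20, 125  # values from the text

# z score: how many standard deviations the promise is above the mean.
z = (promised_days - mean_days) / sd_days

# P(Z <= z) on the standard normal curve.
p_on_time = NormalDist().cdf(z)

# Same probability without standardizing first.
p_on_time_direct = NormalDist(mean_days, sd_days).cdf(promised_days)

print(f"z={z:.2f} P(on time)={p_on_time:.3f}")
```

Standardizing first and working with the original mean and standard deviation give the same probability; the z score is simply a convenient common scale.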

Now, suppose you were a Quality manager in a factory. A process change was made. You want to know if the change reduced process variation.

From samples collected before and after the process change, suppose the average (µ = 52) is the same for both groups but the standard deviations (σ = 6 vs 12) are quite different.

Is the difference large enough to say the population variances are different?

An F test, based on the F distribution, checks for differences in variance at a specified significance level (α).

Let’s look at an example.

What’s a null hypothesis?

What does a significance level of α = .05 mean?

Ok. But what’s Alpha ( α )?

An α of 5% means there’s a 5% chance we say variance changed when in reality it didn’t.

So, when the variance really has not changed, there’s a 5% chance we wrongly say it did and a 95% chance we correctly say it didn’t.

To find the critical F we need to know the shape of the F distribution.

The shape depends on how many degrees of freedom our sample numbers have

At different degrees of freedom (df) the shape of the F distribution changes.

What are degrees of freedom?

Degrees of freedom are the number of values that have the freedom to vary.

For example, a student needs to take nine courses to graduate, and only nine courses are offered. There are eight degrees of freedom. Why? The student is able to choose classes one through eight in any order, but after taking these eight classes we know what the ninth class must be: it is the only class left.

The degrees of freedom (df) for group 1 (v1) is n1 – 1 = 9 – 1 =8

The degrees of freedom (df) for group 2 (v2) is n2 – 1 = 7 – 1 = 6

Since our calculated F value (5) is less than the critical F value (5.59), we cannot reject the null hypothesis that the variances in the 2 groups are equal

What if our manager asked us to test whether process variance was less after the process change?

In this case we do not have a two-tailed test, since we are only interested in “less than.” This is now a one-directional (one-tailed) test.

Since the calculated F (5) is in the rejection region (>4.15), we reject H0 and say process variance after the change is less.
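The two decisions above follow the same rule with different critical values. The sketch below is only an illustration in Python: the sample sizes (9 and 7), the calculated F (5) and the critical values (5.59 two-tailed, 4.15 one-tailed) are copied from the text rather than computed, and `reject_null` is a hypothetical helper name.

```python
def reject_null(f_calculated, f_critical):
    """Reject H0 (equal variances) only if the calculated F exceeds the critical F."""
    return f_calculated > f_critical

n1, n2 = 9, 7              # sample sizes from the text
df1, df2 = n1 - 1, n2 - 1  # degrees of freedom: 8 and 6

f_calculated = 5.0         # calculated F, copied from the text
f_crit_two_tailed = 5.59   # critical F for the two-tailed test, copied from the text
f_crit_one_tailed = 4.15   # critical F for the one-tailed test, copied from the text

two_tailed_reject = reject_null(f_calculated, f_crit_two_tailed)
one_tailed_reject = reject_null(f_calculated, f_crit_one_tailed)

print(f"df=({df1}, {df2}) two-tailed reject={two_tailed_reject} "
      f"one-tailed reject={one_tailed_reject}")
```

The same calculated F leads to opposite conclusions because the one-tailed test puts all of α in one tail, which lowers the critical value.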

Often as analysts we are asked to determine if differences in sample averages are statistically significant or not.

If we have 2 groups we conduct a t test.

Below are study hours for 6 females and 5 males. Is average study time different by gender?

First perform an F test to see if variance between groups is different or not. This determines the type of t test we do.

We cannot reject the hypothesis of equal variances; this tells us which type of t test to use. Select Data Analysis on the Data tab.

In the pop-up select the two-sample t test assuming equal variances. Click OK.

On average females in our test group study more than males. But can we reject the null hypothesis and say females study more or less than males?

Since the t stat calculated (1.36) is less than the t critical 2 tail (2.26) we can’t say females study more or less than males. If we did there would be a 20.5% chance of error.

Since the t stat calculated (1.36) is less than the t critical 1 tail (1.83) we can’t say females study more than males. If we did there would be a 10.2% chance of error.

Sometimes, we need to test for differences in means across more than 2 groups. We use ANOVA in Excel.

Real estate agents, architects and stockbrokers were asked to report their degree of job-related stress. Below is the Excel file with the 3 groups’ data:

Click on the DATA tab and select DATA ANALYSIS. In the pop-up select “Single Factor” since we are only considering one factor (profession), with stress as the measured response

In the pop-up “Input Range,” highlight the entire range of data. Be sure to include the labels (row 1) and check “Labels in First Row.”

Specify an alpha of .05.

Real estate agents tested had the highest stress. But, the results are not significant because the calculated F (1.19) is less than the Critical F (3.2).

The last type of continuous distribution we will look at is an exponential distribution.

An exponential distribution arises naturally when modeling the time between independent events that happen at a constant average rate.

That sounds a lot like a Poisson Distribution….. how is an exponential distribution different?

The Poisson distribution models the number of occurrences in a certain fixed time, with average µ. It is a discrete distribution, taking on values 0, 1, 2, …

The exponential distribution models the time between events, with expected value λ = 1/µ. It is a continuous distribution.

In our factory example, the average number of defects per hour was µ=1.8 (a Poisson distribution)

The mean time between defects is λ = 1/µ = .56 hours per defect (Exponential distribution)

For a ride at Disney World, the mean time to wait in line is 22 min. What is the probability of waiting ≤ 15 min?

A mechanic installs 3 mufflers per hour. What is the probability the time to install a muffler will be ½ hr. or less?

In Excel, the probability that install time (X) will be ≤ t (0.5 hrs), given we can install 3 per hour, is:

There is a 77.7% chance a muffler can be installed in 0.5 hrs or less: the rate is 3 installs per hour, so P(X ≤ 0.5) = 1 − e^(−3 × 0.5) = 0.777.
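The exponential examples can be sketched in Python using the cumulative formula P(X ≤ t) = 1 − e^(−rate × t). This is an illustration only; `expon_cdf` is a hypothetical helper name, and it takes the rate (events per unit time), which is 1 divided by the mean time between events.

```python
from math import exp

def expon_cdf(t, rate):
    """P(X <= t) when events occur at `rate` per unit time (mean time = 1 / rate)."""
    return 1 - exp(-rate * t)

# Factory: 1.8 defects per hour, so the mean time between defects is 1/1.8 hours.
mean_time_between_defects = 1 / 1.8

# Disney ride: mean wait of 22 minutes, so the rate is 1/22 per minute.
p_wait_15 = expon_cdf(15, rate=1 / 22)

# Mechanic: 3 mufflers per hour, so the rate is 3 and the mean install time is 1/3 hour.
p_install_half_hour = expon_cdf(0.5, rate=3)

# Caution: passing the mean time (1/3) where the rate (3) belongs gives about 0.153.
p_if_mean_used_as_rate = expon_cdf(0.5, rate=1 / 3)

print(f"mean time between defects={mean_time_between_defects:.2f} hours")
print(f"P(wait <= 15 min)={p_wait_15:.3f}")
print(f"P(install <= 0.5 hr)={p_install_half_hour:.3f}")
```

A quick sanity check helps catch a swapped parameter: with a mean install time of 20 minutes, the chance of finishing within 30 minutes should be well above 50%.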

Why is it called an exponential distribution?