The Geometric Distribution is a discrete probability distribution that models the number of Bernoulli trials (i.e., independent yes/no experiments that all share the same probability of success) needed to get the first success. A closely related convention counts the number of failures before the first success instead, which is always one less than the number of trials. This post uses the number-of-trials convention throughout.
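To make the definition concrete, here is a minimal simulation sketch in Python. The function name `trials_until_first_success`, the seed, and the choice of p = 0.25 are illustrative assumptions, not part of any library. It repeats Bernoulli trials until the first success and records how many trials that took.

```python
import random

def trials_until_first_success(p, rng):
    """Run Bernoulli trials with success probability p.

    Returns the number of the trial on which the first success occurs (1, 2, 3, ...).
    """
    trials = 1
    # rng.random() is uniform on [0, 1), so a trial fails with probability 1 - p.
    while rng.random() >= p:
        trials += 1
    return trials

rng = random.Random(42)   # fixed seed (arbitrary) so the run is repeatable
p = 0.25                  # assumed probability of success, for illustration only
samples = [trials_until_first_success(p, rng) for _ in range(20_000)]

sample_mean = sum(samples) / len(samples)
print("smallest value observed:", min(samples))
print("average number of trials:", round(sample_mean, 2))
```

Every simulated value is at least 1, and with this many samples the average should land close to 1 / p = 4.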
Properties of Geometric Distribution:
The Geometric Distribution is defined by a single parameter: the probability of success (p) in a Bernoulli trial. It has several important properties, including:
- Discrete values: The Geometric Distribution is a discrete probability distribution, meaning that it can only take positive integer values (1, 2, 3, etc.).
- Two possible outcomes: The Geometric Distribution assumes that each trial has only two possible outcomes: success or failure. The probability of success on each trial is constant across all trials.
- Independence of trials: The trials are independent; the outcome of one trial does not affect the outcome of other trials. Together with the constant probability of success, this means that past failures do not make a success any more or less likely on the next trial.
- Positive values: Because at least one trial is needed to get a success, the Geometric Distribution can never take a value less than 1. (Under the alternative convention that counts failures before the first success, the smallest value is 0.)
- Long-tailed distribution: The Geometric Distribution is right-skewed. The most likely value is always 1, and each additional trial is less likely than the one before it by a factor of (1 - p). Large values are therefore possible but increasingly unlikely, and the tail stretches further to the right as p gets smaller.
Probability Mass Function (PMF) - Geometric Distribution
The probability mass function (PMF) of the Geometric Distribution gives the probability that the first success occurs on exactly the x-th Bernoulli trial, i.e., that the first x - 1 trials are failures and trial x is a success. The formula for the PMF is as follows:
$$\Large{f(x) = (1 - p)^{x-1}p}$$
Where:
- \(f(x)\) is the probability mass function
- \(x\) is the number of Bernoulli trials up to and including the first success (\(x = 1, 2, 3, \dots\))
- \(p\) is the probability of success in a Bernoulli trial
- \((1 - p)\) is the probability of failure in a Bernoulli trial
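The PMF formula translates directly into a few lines of Python. This is a minimal sketch: the function name `geometric_pmf` and the value p = 0.5 are assumptions chosen for illustration.

```python
def geometric_pmf(x, p):
    """P(X = x): probability that the first success occurs on trial x."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    if not isinstance(x, int) or x < 1:
        return 0.0  # the distribution only puts mass on x = 1, 2, 3, ...
    # x - 1 failures, each with probability (1 - p), followed by one success
    return (1 - p) ** (x - 1) * p

p = 0.5  # assumed probability of success, for illustration only
for x in range(1, 5):
    print(x, geometric_pmf(x, p))
# prints:
# 1 0.5
# 2 0.25
# 3 0.125
# 4 0.0625
```

Each probability is the previous one multiplied by (1 - p), and summing the PMF over all x gives 1.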
Cumulative Distribution Function (CDF)
The cumulative distribution function (CDF) of a probability distribution gives the probability that a random variable takes on a value less than or equal to a certain value. In the case of the Geometric Distribution, the CDF gives the probability that the number of Bernoulli trials needed to get the first success is less than or equal to a certain value.
The formula for the CDF of the Geometric Distribution is as follows:
$$\Large{F(x) = 1 - (1 - p)^x}$$
Where:
- \(F(x)\) is the cumulative distribution function
- \(x\) is the number of Bernoulli trials (\(x = 1, 2, 3, \dots\))
- \(p\) is the probability of success in a Bernoulli trial
- \((1 - p)\) is the probability of failure in a Bernoulli trial
For example, if the probability of success in a Bernoulli trial is 0.5, the CDF of the Geometric Distribution would be:
$$F(1) = 1 - (1 - 0.5)^1 = 1 - 0.5 = 0.5 $$
$$F(2) = 1 - (1 - 0.5)^2 = 1 - 0.25 = 0.75 $$
$$F(3) = 1 - (1 - 0.5)^3 = 1 - 0.125 = 0.875$$
This means that the probability that the number of Bernoulli trials needed to get the first success is less than or equal to 1 is 0.5, the probability that it is less than or equal to 2 is 0.75, and the probability that it is less than or equal to 3 is 0.875.
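The worked CDF values above can be reproduced with a short Python sketch. The function name `geometric_cdf` is an assumption made for illustration. The loop also checks that the closed-form CDF matches the running sum of PMF values.

```python
import math

def geometric_cdf(x, p):
    """P(X <= x): probability that the first success occurs within x trials."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    if x < 1:
        return 0.0
    # (1 - p) ** x is the probability that the first x trials all fail
    return 1 - (1 - p) ** int(x)

p = 0.5
for x in (1, 2, 3):
    # Summing the PMF (1 - p)^(k - 1) * p for k = 1..x should give the same value.
    pmf_sum = sum((1 - p) ** (k - 1) * p for k in range(1, x + 1))
    assert math.isclose(geometric_cdf(x, p), pmf_sum)
    print(x, geometric_cdf(x, p))
# prints:
# 1 0.5
# 2 0.75
# 3 0.875
```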
Mean, Standard Deviation and Variance of Geometric Distribution
Mean:
The mean of a Geometric Distribution is calculated as:
$$\Large{\mu = \frac{1}{p}}$$
Where μ is the mean and p is the probability of success in a Bernoulli trial.
For example, if the probability of success in a Bernoulli trial is 0.5, the mean of the Geometric Distribution would be:
$$\mu = \frac{1}{0.5} = 2$$
This means that, on average, it would take 2 Bernoulli trials to get the first success.
Standard Deviation:
The standard deviation of a Geometric Distribution is calculated as:
$$\Large{\sigma = \sqrt{\frac{1 - p}{p^2}}}$$
Where σ is the standard deviation and p is the probability of success in a Bernoulli trial.
Variance:
The variance of a Geometric Distribution is calculated as the square of the standard deviation:
$$\Large{\text{Variance} = \sigma^2 = \frac{1 - p}{p^2}}$$
Where \(\sigma^2\) is the variance and p is the probability of success in a Bernoulli trial.
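As a quick check of the three formulas, the following Python sketch computes the mean, variance, and standard deviation for p = 0.5. The function name `geometric_moments` is an assumption made for illustration.

```python
import math

def geometric_moments(p):
    """Return (mean, variance, standard deviation) of the number of trials."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    mean = 1 / p
    variance = (1 - p) / p ** 2
    std_dev = math.sqrt(variance)  # the standard deviation is the square root of the variance
    return mean, variance, std_dev

mean, variance, std_dev = geometric_moments(0.5)
print(mean)               # 2.0
print(variance)           # 2.0
print(round(std_dev, 4))  # 1.4142
```

With p = 0.5 the variance is (1 - 0.5) / 0.5² = 2, so the standard deviation is √2 ≈ 1.414.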
Using Excel:
Unfortunately, there is no GEOM.DIST() or GEOM.INV() function in Excel. You will have to perform these calculations manually using the formulas for the PMF and CDF shown above in this post.
Conclusion
The Geometric Distribution is useful for modelling the number of trials needed to get the first success in a series of independent trials (or, equivalently, the number of failures before that success, which is one less). It is right-skewed with a long tail and is defined by a single parameter: the probability of success in a Bernoulli trial.