
Theoretical Distribution is an important chapter of CA Foundation Quantitative Aptitude that helps students understand how probability works in real-life situations. It builds a strong base for solving questions on uncertainty, chance, and risk, which are commonly asked in the CA Foundation exam. Most questions from this chapter are formula-based and test the student’s ability to apply concepts correctly.
The chapter mainly focuses on theoretical distributions such as Binomial, Poisson, and Normal distribution. Students learn how probabilities are distributed when outcomes follow a fixed rule rather than actual observations. Understanding random variables, parameters like mean and variance, and correct use of formulas is the key to scoring well.
The chapter builds on basic probability concepts to analyze uncertainty and risk. Many exam questions require a direct calculation of a measure such as the mean or variance of a given distribution.
Before studying theoretical distributions, students must understand the difference between frequency distribution and probability distribution.
A frequency distribution shows how many times an event occurs.
A probability distribution shows how total probability (which is always 1) is distributed among possible outcomes.
A random variable is the quantity whose values depend on chance. It forms the base of every probability distribution. Examples include:
Number of heads in coin tosses
Number of accidents per day
Height, weight, or marks of students
Discrete Random Variable
Takes countable values such as 0, 1, 2, etc.
Example: Number of defective items.
Continuous Random Variable
Takes any value within a range and is measured.
Example: Height or temperature.
A discrete variable is described by a Probability Mass Function (PMF), while a continuous variable is described by a Probability Density Function (PDF).
A theoretical distribution is based on expected probability rules, while an observed distribution comes from actual experimental data.
For example, the theoretical probability of two heads in two coin tosses is 1/4, but actual results may differ in real experiments.
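The gap between a theoretical and an observed distribution can be sketched in a few lines of Python. This is only an illustration: the seed, the number of simulated trials, and the variable names are assumptions, not part of the chapter.

```python
from fractions import Fraction
from itertools import product
import random

# Theoretical: enumerate the 4 equally likely outcomes of two fair tosses.
outcomes = list(product("HT", repeat=2))
theoretical = Fraction(sum(o == ("H", "H") for o in outcomes), len(outcomes))

# Observed: simulate the experiment (seed fixed only so the run is repeatable).
rng = random.Random(0)
trials = 10_000
hits = sum(rng.choices("HT", k=2) == ["H", "H"] for _ in range(trials))
observed = hits / trials

print(theoretical)  # 1/4
print(observed)     # near 0.25, but usually not exactly 0.25
```

The observed relative frequency moves closer to the theoretical value as the number of trials grows, but any single experiment can differ from it.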
This chapter primarily covers three key distributions:
Binomial Distribution (Discrete)
Poisson Distribution (Discrete)
Normal Distribution (Continuous)
The Binomial Distribution is a discrete probability distribution attributed to Jacob Bernoulli. Its individual trials are called Bernoulli Trials, and its single-trial case (n = 1) is the Bernoulli Distribution.
Experiments modeled by the Binomial Distribution have these characteristics:
Fixed number of trials (n).
Each trial has exactly two outcomes: success or failure.
Success is the outcome of interest.
Trials are independent.
Probability of success (p) remains constant for every trial.
n: The number of trials.
p: The probability of success in a single trial.
q: The probability of failure in a single trial.
A fundamental relationship is p + q = 1.
The probability of exactly 'r' successes in 'n' trials is given by:
P(X = r) = ⁿCᵣ * pʳ * qⁿ⁻ʳ
Where:
r is the desired number of successes (0 ≤ r ≤ n).
ⁿCᵣ accounts for the number of ways r successes can occur in n trials.
The formula logically combines the probability of r successes (pʳ), n-r failures (qⁿ⁻ʳ), and the combinations of arrangements (ⁿCᵣ). For example, if n=5 and r=2, then P(X=2) = ⁵C₂ * p² * q³.
The sum of probabilities of all possible outcomes (0 to n successes) in a binomial experiment is always 1.
P(X=0) + P(X=1) + … + P(X=n) = 1
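As a rough check of the formula and of the sum-to-one property, the PMF can be coded directly. The function name `binomial_pmf` and the values n = 5, p = 0.8 are illustrative choices.

```python
from math import comb

def binomial_pmf(r: int, n: int, p: float) -> float:
    """P(X = r) for X ~ B(n, p): nCr * p^r * q^(n-r)."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

n, p = 5, 0.8
probs = [binomial_pmf(r, n, p) for r in range(n + 1)]
total = sum(probs)

print(probs)
print(total)  # equals 1 up to floating-point rounding
```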
To find the probability of at least one success (P(X ≥ 1)), use the shortcut:
P(X ≥ 1) = 1 - P(X = 0)
(Memory Tip: To avoid lengthy calculations for 'at least one', calculate the opposite ('zero success') and subtract from 1.)
Given: n=5, p=0.8, q=0.2.
Find: P(X ≥ 1).
Solution:
P(X=0) = ⁵C₀ * (0.8)⁰ * (0.2)⁵ = 1 * 1 * 0.00032 = 0.00032
P(X ≥ 1) = 1 - 0.00032 = 0.99968
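The same calculation can be sketched in Python, comparing the complement shortcut with the longer route of summing P(X=1) through P(X=5). Variable names are illustrative.

```python
from math import comb

n, p = 5, 0.8
q = 1 - p

p_zero = comb(n, 0) * p**0 * q**n   # P(X = 0)
shortcut = 1 - p_zero               # P(X >= 1) via the complement
long_way = sum(comb(n, r) * p**r * q**(n - r) for r in range(1, n + 1))

print(round(shortcut, 5))
print(round(long_way, 5))
```

Both routes agree; the shortcut simply needs one term instead of five.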
A distribution is Binomial if it meets these conditions:
Fixed Number of Trials (n): Must be finite and fixed.
Two Outcomes: Each trial results in either Success or Failure.
Independent Trials: Outcome of one doesn't affect others.
Constant Probability of Success (p): p (and q) remains constant.
Counting Successes: X represents the total number of successes.
The Binomial Distribution is biparametric, defined by n and p.
Notation: X ~ B(n, p).
Example: X ~ B(6, 1/3) means n=6, p=1/3 (so q=2/3).
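For the example X ~ B(6, 1/3), the whole distribution can be tabulated with exact fractions. The sketch below is only an illustration; the running total column is the kind of cumulative sum needed for "at most" questions.

```python
from fractions import Fraction
from math import comb

n, p = 6, Fraction(1, 3)
q = 1 - p

# P(X = r) for r = 0..6, kept as exact fractions.
pmf = [comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]

running = Fraction(0)
for r, prob in enumerate(pmf):
    running += prob
    print(r, prob, running)  # r, P(X = r), P(X <= r)

at_most_three = sum(pmf[:4])  # P(X <= 3)
```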
"At least one" (X ≥ 1): One or more successes. Often solved with 1 - P(X=0).
"At most three" (X ≤ 3): Zero, one, two, or three successes. Requires summing P(X=0) + P(X=1) + P(X=2) + P(X=3).
Problem: For X ~ B(6, p), if 4 * P(X=4) = P(X=2), find p.
Solution:
4 * [⁶C₄ * p⁴ * q²] = [⁶C₂ * p² * q⁴]
4 * [15 * p⁴ * q²] = [15 * p² * q⁴]
Simplify: 4 * p² = q²
Substitute q = 1 - p: 4p² = (1 - p)²
Take square root: 2p = 1 - p => 3p = 1 => p = 1/3.
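The answer p = 1/3 can be checked by substituting it back into the original condition. A minimal sketch with exact fractions (the helper name `pmf` is an assumption):

```python
from fractions import Fraction
from math import comb

def pmf(r, n, p):
    """Binomial P(X = r) with q = 1 - p."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

p = Fraction(1, 3)
lhs = 4 * pmf(4, 6, p)   # 4 * P(X = 4)
rhs = pmf(2, 6, p)       # P(X = 2)

print(lhs, rhs, lhs == rhs)
```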
For B(n, p):
Mean (μ): np
Variance (σ²): npq
Standard Deviation (σ): √(npq)
Property: For any binomial distribution with p > 0 (so q < 1), Mean > Variance (since np > npq).
Maximum Variance: Occurs when p = q = 0.5, with a value of n/4.
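These results can be checked numerically by computing the mean and variance straight from the PMF. The sketch below assumes illustrative values n = 20 and p = 0.3, and scans p = 0.1, 0.2, ..., 0.9 to see where the variance peaks.

```python
from math import comb

def binomial_moments(n, p):
    """Mean and variance computed from the definition, not from np and npq."""
    q = 1 - p
    pmf = [comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]
    mean = sum(r * pr for r, pr in enumerate(pmf))
    var = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))
    return mean, var

n = 20
mean, var = binomial_moments(n, 0.3)
print(mean, var)  # compare with np = 6 and npq = 4.2

# Variance for p = 0.1 ... 0.9; the largest should occur at p = 0.5.
variances = {k / 10: binomial_moments(n, k / 10)[1] for k in range(1, 10)}
best_p = max(variances, key=variances.get)
print(best_p, variances[best_p])  # compare with n/4 = 5
```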
If X ~ B(n₁, p) and Y ~ B(n₂, p) are independent with the same p, then (X + Y) ~ B(n₁ + n₂, p).
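The additive property can be illustrated by convolving two binomial PMFs, which is how the distribution of a sum of independent variables is obtained. The values n₁ = 2, n₂ = 3, p = 1/3 and the helper names are assumptions chosen for the sketch.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """List of P(X = r) for r = 0..n."""
    return [comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)]

def convolve(a, b):
    """PMF of X + Y for independent X and Y with PMFs a and b."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

p = Fraction(1, 3)
sum_pmf = convolve(binomial_pmf(2, p), binomial_pmf(3, p))

print(sum_pmf == binomial_pmf(5, p))  # True
```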
Problem: For a binomial distribution, mean = 10, SD = √5. Find the mode.
Solution:
np = 10, npq = 5.
q = npq / np = 5 / 10 = 0.5.
p = 1 - q = 0.5.
n = np / p = 10 / 0.5 = 20.
(n+1)p = (20+1) * 0.5 = 10.5.
Since 10.5 is not an integer, the mode is its integral part: Mode = 10.
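The steps of this solution can be sketched in Python and cross-checked against the PMF itself. The integer case in the code (two modes, (n+1)p − 1 and (n+1)p) is the standard rule and is not stated in the worked problem; it is included only so the sketch is complete.

```python
from fractions import Fraction
from math import comb, floor

mean, variance = Fraction(10), Fraction(5)  # SD = sqrt(5), so variance = 5
q = variance / mean                         # npq / np
p = 1 - q
n = int(mean / p)                           # np / p
k = (n + 1) * p                             # (n + 1)p

if k.denominator == 1:
    # Standard rule (assumption, not from the text): two modes.
    modes = [int(k) - 1, int(k)]
else:
    modes = [floor(k)]                      # integral part of (n + 1)p

# Cross-check: locate the largest probability directly.
pmf = [comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]
argmax = max(range(n + 1), key=lambda r: pmf[r])

print(n, p, k, modes, argmax)
```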
Problem: An experiment succeeds three times as often as it fails. Repeated five times. What is the probability of no success?
Solution:
p = 3q. Since p + q = 1, 3q + q = 1 => 4q = 1 => q = 1/4.
Thus, p = 3/4.
n = 5. We need P(X=0).
P(X=0) = ⁵C₀ * (3/4)⁰ * (1/4)⁵ = 1 * 1 * (1/1024) = **1/1024**.
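A short sketch of the same solution with exact fractions; the variable `ratio` (successes per failure) is a name introduced here for illustration.

```python
from fractions import Fraction
from math import comb

ratio = 3                      # p = 3q
q = Fraction(1, ratio + 1)     # from p + q = 1
p = 1 - q
n = 5

p_none = comb(n, 0) * p**0 * q**n
print(p, q, p_none)  # 3/4 1/4 1/1024
```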
The Poisson Distribution is used for rare events where:
Number of trials is very large
Probability of success is very small
Examples include accidents, printing errors, or system failures.
P(X = r) = (e⁻ᵐ × mʳ) / r!
Here, m is the mean (m > 0), r = 0, 1, 2, … is the number of occurrences, and e ≈ 2.71828.
Mean = Variance = m
If m is non-integer → mode is the integral part of m
If m is integer → modes are m and m − 1
1. No Accidents: Average accidents per day m = 2. Find P(X=0).
P(X=0) = e⁻² ≈ **0.1353**
2. Finding the Mean: Given P(X=2) = P(X=3). Find m.
(e⁻ᵐ * m²) / 2! = (e⁻ᵐ * m³) / 3!
1 / 2 = m / 6 => m = **3**.
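Both results can be verified with a small Poisson PMF helper. The function name `poisson_pmf` is an illustrative choice.

```python
from math import exp, factorial, isclose

def poisson_pmf(r: int, m: float) -> float:
    """P(X = r) for a Poisson variable with mean m."""
    return exp(-m) * m**r / factorial(r)

# Example 1: no accidents when m = 2.
p0 = poisson_pmf(0, 2)
print(round(p0, 4))  # 0.1353

# Example 2: with m = 3, P(X = 2) and P(X = 3) coincide.
equal_at_three = isclose(poisson_pmf(2, 3), poisson_pmf(3, 3))
print(equal_at_three)  # True
```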
The mode depends on whether the mean m is an integer:
| Case | Nature of m | Number of Modes | Value of Mode(s) |
|---|---|---|---|
| 1 | m is a non-integer | Unimodal | Integral part of m. |
| 2 | m is an integer | Bimodal | m and m-1. |
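The two cases in the table can be checked by searching the PMF for its largest value(s). This is a sketch only: the search range (up to int(m) + 10) and the tolerance used to detect ties are assumptions that work for small m.

```python
from math import exp, factorial, isclose

def poisson_pmf(r, m):
    return exp(-m) * m**r / factorial(r)

def poisson_modes(m):
    """Values of r with the largest probability (ties detected with a tolerance)."""
    probs = [poisson_pmf(r, m) for r in range(int(m) + 11)]
    peak = max(probs)
    return [r for r, pr in enumerate(probs) if isclose(pr, peak, rel_tol=1e-12)]

print(poisson_modes(2.5))  # [2]
print(poisson_modes(3))    # [2, 3]
```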
If X ~ P(m₁) and Y ~ P(m₂) are independent Poisson variables, then (X + Y) ~ P(m₁ + m₂).
Problem: 1% of flights have a minor equipment failure. What is the probability of exactly two failures in 100 flights?
The Poisson distribution applies here as an approximation to the binomial (n is large, p is small).
m = np = 100 * 0.01 = 1.
P(X=2) = (e⁻¹ * 1²) / 2! = 1 / (2e) ≈ **0.1839**.
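To see how good the Poisson approximation is for this problem, the sketch below compares it with the exact binomial probability for n = 100 and p = 0.01. Variable names are illustrative.

```python
from math import comb, exp, factorial

n, p, r = 100, 0.01, 2
m = n * p  # Poisson mean

poisson = exp(-m) * m**r / factorial(r)
binomial = comb(n, r) * p**r * (1 - p)**(n - r)

print(round(poisson, 4))   # 0.1839
print(round(binomial, 4))  # exact binomial value, slightly higher
```

The two values differ by roughly 0.001, which is why the Poisson form is acceptable when n is large and p is small.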
The Normal Distribution is used for continuous random variables. It's also called the Gaussian Distribution, after Carl Friedrich Gauss.
Shape: Bell-shaped and perfectly symmetric.
Symmetry: Skewness is zero.
Measures of Central Tendency: At the peak, Mean = Median = Mode.
Asymptotic Tails: Extend from -∞ to +∞, approaching but never touching the x-axis.
Area Under the Curve: Total area is 1 (or 100%). Area to the left/right of mean is 0.5.
It is biparametric:
Mean (μ): Locates the center of the curve.
Variance (σ²): Determines the spread.
Notation: X ~ N(μ, σ²).
Mean Deviation (MD): MD ≈ 0.8 * σ
Quartile Deviation (QD): QD ≈ 0.675 * σ
Quartiles: Q₁ = μ - 0.675σ, Q₃ = μ + 0.675σ
Points of Inflection: μ - σ and μ + σ
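The constants 0.8 and 0.675 are rounded values. A quick sketch with Python's `statistics.NormalDist` shows where they come from; the values μ = 50 and σ = 10 are hypothetical, and the exact mean deviation formula σ√(2/π) is a standard result that the text does not state.

```python
from math import pi, sqrt
from statistics import NormalDist

mu, sigma = 50, 10           # hypothetical parameters
dist = NormalDist(mu, sigma)

q1 = dist.inv_cdf(0.25)      # first quartile
q3 = dist.inv_cdf(0.75)      # third quartile
qd = (q3 - q1) / 2           # quartile deviation
md = sigma * sqrt(2 / pi)    # mean deviation about the mean (standard result)
inflection = (mu - sigma, mu + sigma)

print(qd / sigma)   # about 0.6745, rounded to 0.675
print(md / sigma)   # about 0.798, rounded to 0.8
print(q1, q3, inflection)
```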
If X ~ N(μ₁, σ₁²) and Y ~ N(μ₂, σ₂²) are independent, then (X + Y) ~ N(μ₁ + μ₂, σ₁² + σ₂²). Means and variances add up.
Mean = 50, SD = 10. Find Median.
Median = Mean = 50.
Points of Inflection are 6 and 14. Find SD.
μ - σ = 6, μ + σ = 14. Adding gives 2μ = 20 => μ = 10.
Substitute μ=10: 10 - σ = 6 => σ = **4**.
A special case with fixed parameters:
Mean (μ) = 0
Standard Deviation (σ) = 1
Variance (σ²) = 1
Properties are constant: MD = 0.8, QD = 0.675, Q₁ = -0.675, Q₃ = 0.675, Points of Inflection = -1 and +1.
The percentages of data within standard deviation intervals are fixed:
μ ± 1σ → 68.27%
μ ± 2σ → 95.45%
μ ± 3σ → 99.73%
Detailed Breakdown: Area between μ and μ + 1σ is 34.135%; between μ + 1σ and μ + 2σ is 13.59%; between μ + 2σ and μ + 3σ is 2.14%. These are symmetric around the mean.
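These percentages can be reproduced from the standard normal cumulative distribution. A minimal sketch using `statistics.NormalDist` (helper and variable names are illustrative):

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

def area_within(k):
    """Area between mu - k*sigma and mu + k*sigma."""
    return z.cdf(k) - z.cdf(-k)

within = {k: area_within(k) for k in (1, 2, 3)}

# One-sided bands: mu to mu+1σ, mu+1σ to mu+2σ, mu+2σ to mu+3σ.
bands = [z.cdf(k + 1) - z.cdf(k) for k in range(3)]

for k, area in within.items():
    print(k, round(area * 100, 2))
print([round(b * 100, 2) for b in bands])
```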
To standardize values, convert X to Z:
Z = (X - μ) / σ
At X=μ, Z=0.
At X=μ+1σ, Z=1.
At X=μ-1σ, Z=-1, etc.
Φ(k) = P(Z < k), representing the total probability to the left of a given Z-value.
Φ(1) = P(Z < 1) = P(Z<0) + P(0<Z<1) = 0.50 + 0.34135 = 0.84135.
Properties:
Φ(-k) = 1 - Φ(k)
P(a < Z < b) = Φ(b) - Φ(a)
P(Z > k) = 1 - Φ(k)
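The three properties can be checked numerically. The sketch below writes Φ in terms of the error function, Φ(k) = ½[1 + erf(k/√2)], which is a standard identity rather than something given in the text; the name `phi` is an illustrative choice.

```python
from math import erf, isclose, sqrt

def phi(k: float) -> float:
    """Standard normal CDF, P(Z < k)."""
    return 0.5 * (1 + erf(k / sqrt(2)))

print(round(phi(1), 5))

symmetry_ok = isclose(phi(-1), 1 - phi(1), abs_tol=1e-12)  # Φ(-k) = 1 - Φ(k)
between = phi(2) - phi(-1)                                 # P(-1 < Z < 2)
right_tail = 1 - phi(1)                                    # P(Z > 1)

print(symmetry_ok, between, right_tail)
```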
The Z-table typically gives the area between the mean (Z=0) and a positive Z-value.
Example: To find P(0 < Z < 1.12), look up Z=1.12 in the table. If it's 0.3686, then P(0 < Z < 1.12) = 0.3686.
Steps:
Identify μ and σ.
Convert X-value(s) to Z-score(s) using Z = (X - μ) / σ.
Calculate probability using Z-table/properties.
Multiply by total sample size if asked for the number of items.
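The four steps above can be sketched as a small function. The data in the usage line (μ = 50, σ = 10, 1000 students, marks between 40 and 60) is hypothetical, and `NormalDist().cdf` stands in for a Z-table lookup.

```python
from statistics import NormalDist

def expected_count(mu, sigma, low, high, total):
    """Expected number of items with low < X < high out of `total`."""
    # Step 2: convert the X-values to Z-scores.
    z_low = (low - mu) / sigma
    z_high = (high - mu) / sigma
    # Step 3: probability from the standard normal CDF (replaces the Z-table).
    std = NormalDist()
    prob = std.cdf(z_high) - std.cdf(z_low)
    # Step 4: scale by the sample size.
    return prob * total

# Hypothetical data: 1000 students, marks ~ N(50, 10^2), range 40 to 60.
count = expected_count(mu=50, sigma=10, low=40, high=60, total=1000)
print(round(count))
```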
Worked Example: Inverse Problem (Finding SD)
Problem: μ=500. 16% of values > 600. Find σ.
Solution:
P(X > 600) = 0.16, so P(Z > z) = 0.16, where z = (600 - 500) / σ.
If 16% lies to the right of z, the area to the left is Φ(z) = 1 - 0.16 = 0.84.
Φ(z) = 0.84 implies 0.50 + P(0 < Z < z) = 0.84.
So, P(0 < Z < z) = 0.34. From the Z-table, P(0 < Z < 1) ≈ 0.3413, so z ≈ 1.
Set (600 - 500) / σ = 1 => 100 / σ = 1 => σ = 100.
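The inverse lookup can also be done without a table. The sketch below uses `NormalDist().inv_cdf` in place of reading the Z-table backwards; it gives a z slightly below 1, which is why the table-based answer of σ = 100 is a rounded value.

```python
from statistics import NormalDist

mu, x, right_tail = 500, 600, 0.16

z = NormalDist().inv_cdf(1 - right_tail)  # z with Φ(z) = 0.84
sigma = (x - mu) / z                      # from z = (x - mu) / sigma

print(z, sigma)  # z is close to 1, sigma is close to 100
```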
Problem: X is normally distributed with μ = 500 and σ = 120. Find K such that P(500 < X < K) = 0.4032.
Solution:
Convert X values to Z: Z₁ = (500-500)/120 = 0, Z₂ = (K-500)/120.
P(0 < Z < Z₂) = 0.4032.
From the Z-table, P(0 < Z < z) = 0.4032 corresponds to z = 1.30 (this table value is normally supplied with the problem).
So, (K-500)/120 = 1.30.
K - 500 = 1.30 * 120 = 156.
K = 500 + 156 = **656**.
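As a cross-check of this answer, the z-value can be recovered with `NormalDist().inv_cdf` instead of a table. This is a sketch under the problem's stated values; variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 500, 120
area = 0.4032                          # P(500 < X < K)

z = NormalDist().inv_cdf(0.5 + area)   # area to the left of K is 0.5 + 0.4032
k = mu + z * sigma                     # invert Z = (X - mu) / sigma

print(round(z, 2), round(k))  # 1.3 656
```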