Central Limit Theorem (Distribution of Averages)

Assume that there are n independent and identically distributed (i.i.d.) random variables, that is, each variable has the same probability distribution as the others and all are mutually independent. The Central Limit Theorem (CLT) states that, provided the variables have a finite mean and variance, the distribution of their mean approaches a normal distribution as the number of observations increases. Examples of variables that are approximately normally distributed include intelligence quotient (IQ) scores, measurements from manufacturing processes, and weights, to name a few.

Assume that we have a set of variables $X_1, \dots, X_n$ where each variable has mean $\mu$ and standard deviation $\sigma$. The sample mean $\bar{X}$ is defined as $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_{i}$. By the Central Limit Theorem, for large n:

$$\bar{X} \rightarrow N\left(\mu, \frac{\sigma^{2}}{n}\right)$$
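The parameters of this limiting distribution follow from the assumptions already stated. As a brief sketch, using only the independence of the variables and their common mean and variance:

```latex
\begin{aligned}
E[\bar{X}] &= \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}\cdot n\mu = \mu, \\
\operatorname{Var}(\bar{X}) &= \frac{1}{n^{2}}\sum_{i=1}^{n} \operatorname{Var}(X_i) = \frac{1}{n^{2}}\cdot n\sigma^{2} = \frac{\sigma^{2}}{n}.
\end{aligned}
```

So the standard deviation of the sample mean is $\sigma/\sqrt{n}$, which shrinks as more observations are averaged.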

The following equation standardizes the variable.

$$\frac{\bar{X}-\mu}{\frac{\sigma}{\sqrt{n}}} \to N(0,1)$$

Thus the standardized sample mean converges in distribution to a standard normal variable as n tends to infinity. In practice, the normal approximation is applied once n exceeds a certain threshold; n ≥ 30 is a commonly quoted rule of thumb.
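The convergence can be checked empirically. The following is a minimal simulation sketch, not a proof. It assumes the underlying variables are Uniform(0, 1), chosen here only for illustration, and the function name `standardized_means`, the sample size and the trial count are likewise hypothetical choices. It standardizes many sample means and summarizes the resulting values.

```python
import random
import statistics


def standardized_means(n, trials, seed=0):
    """Return standardized sample means of `trials` samples of size n.

    Assumption for illustration: each observation is Uniform(0, 1),
    which has mean 1/2 and variance 1/12.
    """
    rng = random.Random(seed)
    mu = 0.5
    sigma = (1 / 12) ** 0.5
    z_values = []
    for _ in range(trials):
        sample_mean = sum(rng.random() for _ in range(n)) / n
        # Standardize: (sample mean - mu) / (sigma / sqrt(n))
        z_values.append((sample_mean - mu) / (sigma / n ** 0.5))
    return z_values


z = standardized_means(n=30, trials=20000)
z_mean = statistics.fmean(z)
z_stdev = statistics.pstdev(z)
# Share of standardized means that fall within one unit of zero
within_one = sum(abs(v) <= 1 for v in z) / len(z)

print(f"mean of z:           {z_mean:.3f}")
print(f"std dev of z:        {z_stdev:.3f}")
print(f"share with |z| <= 1: {within_one:.3f}")
```

If the approximation holds, the printed mean should be near 0, the standard deviation near 1, and the share within one unit of zero near 0.68, the value for a standard normal variable.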

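The CLT also makes the binomial distribution easier to evaluate. A binomial variable x with parameters n and p is the sum of n independent Bernoulli trials, each with mean p and variance p(1-p). As the number of trials grows, the binomial distribution is increasingly well approximated by a normal distribution with mean pn and variance p(1-p)n. Standardizing x gives: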

$$z = \frac{x-pn}{\sqrt{p(1-p)n}} \to N(0,1)$$
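To see how well this approximation works, the sketch below compares the exact binomial cumulative probability with the normal approximation built from the z formula above. The helper names (`binomial_pmf`, `binomial_cdf`, `normal_approx_cdf`) and the choices of n, p and k are illustrative assumptions. No continuity correction is applied, to match the formula as written.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist


def binomial_pmf(i, n, p):
    """Exact P(X = i) for X ~ Binomial(n, p), with 0 < p < 1.

    Computed in log space so that large n does not overflow.
    """
    log_coeff = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
    return exp(log_coeff + i * log(p) + (n - i) * log(1 - p))


def binomial_cdf(k, n, p):
    """Exact P(X <= k) as a sum of point probabilities."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Approximate P(X <= k) using z = (k - pn) / sqrt(p(1-p)n)."""
    z = (k - p * n) / sqrt(p * (1 - p) * n)
    return NormalDist().cdf(z)


p = 0.5
errors = []
for n in (20, 200, 2000):
    # Evaluate roughly one standard deviation above the mean
    k = round(n * p + sqrt(n * p * (1 - p)))
    exact = binomial_cdf(k, n, p)
    approx = normal_approx_cdf(k, n, p)
    errors.append(abs(exact - approx))
    print(f"n={n:5d} k={k:5d} exact={exact:.4f} normal={approx:.4f}")
```

The gap between the exact and approximate values should shrink as n increases. A continuity correction (evaluating at k + 0.5) would typically narrow it further for small n.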
