Central Limit Theorem (Distribution of Averages)

Assume that there are n independent and identically distributed (i.i.d.) random variables with finite variance: each has the same probability distribution and all are mutually independent. The Central Limit Theorem (CLT) states that the distribution of the mean of such variables approaches a normal distribution as the number of observations increases. Examples of variables that are approximately normally distributed include IQ scores, measurements from manufacturing processes, and weights, to name a few.

Assume that each variable $X_i$ has mean $\mu$ and standard deviation $\sigma$. The sample mean is defined as $\bar{X} = \frac{1}{n}\sum_{i=1}^{n}X_{i}$. For large n, the sample mean is approximately normally distributed:

$$\bar{X} \to N\left(\mu, \frac{\sigma^{2}}{n}\right)$$
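The mean and variance in this limit follow from linearity of expectation and from independence. A short derivation, assuming only that each $X_i$ has mean $\mu$ and variance $\sigma^{2}$:

```latex
\begin{aligned}
E[\bar{X}] &= \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}\cdot n\mu = \mu \\
\operatorname{Var}(\bar{X}) &= \frac{1}{n^{2}}\sum_{i=1}^{n} \operatorname{Var}(X_i) = \frac{1}{n^{2}}\cdot n\sigma^{2} = \frac{\sigma^{2}}{n}
\end{aligned}
```

The second line uses independence: the variance of a sum of independent variables is the sum of their variances.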

Subtracting the mean and dividing by the standard error $\sigma/\sqrt{n}$ standardizes the sample mean:

$$\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \to N(0,1)$$
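A small simulation can make the standardization concrete. The sketch below is illustrative rather than part of the original text. It assumes an Exponential(1) population (for which $\mu = 1$ and $\sigma = 1$), and the function name `standardized_sample_means`, the sample size, and the seed are all hypothetical choices. It draws many samples, standardizes each sample mean, and checks how closely the results resemble N(0,1).

```python
import random
import statistics


def standardized_sample_means(n, trials, seed=0):
    """Draw `trials` samples of size n from Exponential(1) and
    return the standardized sample mean of each one."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(rate=1)
    z_values = []
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        # (X-bar - mu) / (sigma / sqrt(n))
        z_values.append((x_bar - mu) / (sigma / n ** 0.5))
    return z_values


z = standardized_sample_means(n=50, trials=5000)
z_mean = statistics.fmean(z)
z_std = statistics.pstdev(z)
# Share of standardized means within +/-1.96; about 0.95 under N(0, 1)
inside = sum(abs(v) <= 1.96 for v in z) / len(z)

print("mean of z:", round(z_mean, 3))
print("std of z:", round(z_std, 3))
print("share within 1.96:", round(inside, 3))
```

Even though the exponential distribution is strongly skewed, the standardized means should have a mean near 0, a standard deviation near 1, and roughly 95% of values within ±1.96. Increasing `n` should tighten the match.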

Thus the standardized sample mean can be treated as approximately standard normal once the number of variables n is sufficiently large. A common rule of thumb is n ≥ 30, although strongly skewed distributions may need more.

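The CLT also makes the binomial distribution easier to evaluate, because a binomial variable is a sum of independent Bernoulli trials. As the number of trials n gets larger, the binomial distribution with success probability p is increasingly well approximated by a normal distribution with mean np and variance np(1-p). Standardizing the number of successes x gives: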

$$z = \frac{x-np}{\sqrt{np(1-p)}} \to N(0,1)$$
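To see how well the approximation works in practice, the sketch below compares an exact binomial probability with its normal approximation. The helper names `binomial_cdf` and `normal_approx_cdf` and the values n = 100, p = 0.5, k = 55 are hypothetical, chosen only for illustration. The optional `correction` argument applies the standard continuity correction, which is not mentioned in the text above but is a common refinement.

```python
from math import comb
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p, correction=0.0):
    """Normal approximation of P(X <= k) via z = (x - np) / sqrt(np(1-p)).

    `correction` is added to k before standardizing; 0.5 gives the
    usual continuity correction (an addition not in the source text).
    """
    z = (k + correction - n * p) / (n * p * (1 - p)) ** 0.5
    return NormalDist().cdf(z)


n, p, k = 100, 0.5, 55
exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
approx_cc = normal_approx_cdf(k, n, p, correction=0.5)

print("exact binomial:      ", round(exact, 4))
print("normal approximation:", round(approx, 4))
print("with continuity corr:", round(approx_cc, 4))
```

Here z = (55 - 50) / 5 = 1, so the plain approximation is the standard normal CDF at 1. The continuity-corrected version should land noticeably closer to the exact value, because it accounts for the binomial being discrete.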

