Central Limit Theorem

When analyzing data, studying an entire large population is often impractical. Instead, a subset of data points, referred to as a sample, is drawn from the population.

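The Central Limit Theorem states that the distribution of sample means approaches a normal distribution as the sample size grows, no matter what the distribution of the underlying population is (exponential, uniform, etc.), provided the population has a finite variance. A sample size above 30 is a common rule of thumb for the approximation to be reasonable.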

Properties of Sampling Distribution

The mean of the sampling distribution is close to the mean of the underlying population. As the sample size increases, the sample mean becomes “almost” equal to the mean of the underlying population. This is because a larger sample captures more of the diversity inherent in the population, so individual sample means vary less around the population mean.
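The property above can be checked numerically. The following is a minimal sketch using only the Python standard library. The population (100,000 exponential values with mean 1), the sample sizes, and the number of samples are assumptions chosen for illustration. It compares the spread of the sample means with the standard result that the standard deviation of the sample mean is the population standard deviation divided by the square root of the sample size.

```python
import math
import random
import statistics

random.seed(7)  # fixed seed so the run is repeatable

# Assumed population for illustration: exponential with mean 1.
population = [random.expovariate(1.0) for _ in range(100_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

N_SAMPLES = 2000  # number of samples drawn for each sample size

# For each sample size n, store (mean of sample means,
# observed spread of sample means, theoretical spread pop_sd / sqrt(n)).
results = {}
for n in (10, 40, 160):
    means = [
        statistics.fmean(random.sample(population, n))
        for _ in range(N_SAMPLES)
    ]
    results[n] = (
        statistics.fmean(means),
        statistics.pstdev(means),
        pop_sd / math.sqrt(n),
    )

print(f"population mean = {pop_mean:.3f}")
for n, (centre, spread, expected) in results.items():
    print(f"n={n:4d}  mean of means={centre:.3f}  "
          f"sd of means={spread:.3f}  sigma/sqrt(n)={expected:.3f}")
```

The mean of the sample means should stay near the population mean for every sample size, while the spread should shrink as the sample size grows.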

Depiction of Central Limit Theorem

Generate a random population that follows an exponential distribution
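One way this step could look is sketched below, using only the Python standard library. The population size and the rate parameter are assumptions made for illustration, not values taken from the original experiment.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Assumed for illustration: 100,000 points with rate 1.0.
# random.expovariate takes the rate lambda; the mean is 1 / lambda.
POPULATION_SIZE = 100_000
RATE = 1.0

population = [random.expovariate(RATE) for _ in range(POPULATION_SIZE)]

pop_mean = statistics.fmean(population)
pop_median = statistics.median(population)

# An exponential distribution is right-skewed, so the mean
# sits above the median.
print(f"mean={pop_mean:.3f}, median={pop_median:.3f}")
```

The gap between the mean and the median is a quick sign that the population is far from normal.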

Plot the means of 1000 samples drawn from the population. The size of each sample is 30.
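A rough sketch of this step follows. To stay dependency-free it prints a text histogram instead of a graphical plot. The helper names `sample_means` and `text_histogram` are hypothetical, and the population is regenerated here under the same assumptions (exponential, mean 1) so that the block runs on its own.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumed population for illustration: exponential with mean 1.
population = [random.expovariate(1.0) for _ in range(100_000)]


def sample_means(data, sample_size, n_samples):
    """Return the mean of each of n_samples random samples drawn from data."""
    return [
        statistics.fmean(random.sample(data, sample_size))
        for _ in range(n_samples)
    ]


def text_histogram(values, bins=15, width=50):
    """Render values as a text histogram, one line per bin."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp the maximum value into the last bin.
        idx = min(int((v - lo) / step), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, count in enumerate(counts):
        bar = "#" * round(width * count / peak)
        lines.append(f"{lo + i * step:6.2f} | {bar}")
    return "\n".join(lines)


# 1000 samples of size 30, as described in the text.
means = sample_means(population, sample_size=30, n_samples=1000)
print(text_histogram(means))
```

With a plotting library available, the same `means` list could be passed to a histogram function instead of `text_histogram`.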

How does a sample size of less than or greater than 30 affect the distribution of sample means?

As the sample size increases, the distribution of sample means increasingly resembles a normal distribution.
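This trend can be quantified with skewness, which is 0 for a perfectly symmetric distribution such as the normal. The sketch below is an illustration under assumed settings (exponential population with mean 1, 2000 samples per size, sample sizes 2, 5, 30 and 100). The helper `skewness` is a hypothetical name for a simple moment-based estimate.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Assumed population for illustration: exponential with mean 1.
population = [random.expovariate(1.0) for _ in range(100_000)]


def skewness(values):
    """Moment-based skewness: mean of cubed standardized deviations."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values, mu=m)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


N_SAMPLES = 2000  # number of samples drawn for each sample size

# Map each sample size to the skewness of its sample means.
skews = {}
for n in (2, 5, 30, 100):
    means = [
        statistics.fmean(random.sample(population, n))
        for _ in range(N_SAMPLES)
    ]
    skews[n] = skewness(means)

for n, value in skews.items():
    print(f"sample size {n:3d}: skewness of sample means = {value:.2f}")
```

For an exponential population the skewness of the sample mean is expected to fall roughly like 2/√n, so some right skew remains at a sample size of 30 and keeps shrinking as the sample size grows.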

As evident from the above diagram, the plot closely resembles a normal distribution for sample size ≥ 30.

~ github.com/hodasaiful. *Equity trade lifecycle * Python * SQL * Data Analytics* Programming*