Central Limit Theorem
When analyzing data, studying an entire population can be tedious or impractical. Instead, a subset of data points is drawn from the population; this subset is called a sample.
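The Central Limit Theorem states that the distribution of sample means approaches a normal distribution as the sample size grows, whatever the distribution of the underlying population (exponential, uniform, etc.), provided that the population has a finite variance. A sample size above 30 is a common rule of thumb for the approximation to be reasonable, not a strict threshold.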
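Stated a little more formally, as a sketch under the usual assumptions: the observations X_1, ..., X_n are independent and identically distributed with finite mean μ and finite variance σ². These symbols are notation introduced here for illustration; they do not appear in the text above.

```latex
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
\sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \xrightarrow{\;d\;} \mathcal{N}(0, 1)
\quad \text{as } n \to \infty .
```

In words, the standardized sample mean converges in distribution to a standard normal, so for large n the sample mean is approximately normal with mean μ and variance σ²/n.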
Properties of Sampling Distribution
The sampling distribution of the mean is centered on the mean of the underlying population. As the sample size increases, each sample mean tends to lie closer to the population mean. This is because a larger sample captures more of the diversity inherent in the population, which reduces sampling variability.
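This property can be written compactly. The formulas below assume independent draws from a population with mean μ and finite standard deviation σ, with n the sample size; the symbols are notation introduced here for illustration.

```latex
\mathbb{E}\left[\bar{X}_n\right] = \mu,
\qquad
\operatorname{Var}\left(\bar{X}_n\right) = \frac{\sigma^2}{n},
\qquad
\operatorname{SE}\left(\bar{X}_n\right) = \frac{\sigma}{\sqrt{n}}
```

For example, with σ = 1 the standard error is about 0.183 for n = 30 and 0.1 for n = 100, so quadrupling the sample size halves the typical distance between a sample mean and the population mean.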
Generate a population from an exponential distribution
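A minimal sketch of this step, using only the Python standard library. The population size, rate parameter, and seed are assumptions chosen for illustration; the original does not specify them.

```python
import random
import statistics

random.seed(42)  # arbitrary seed so the run is reproducible

# Hypothetical population: 100,000 draws from an exponential distribution
# with rate 1.0, so the theoretical mean and standard deviation are both 1.0.
POPULATION_SIZE = 100_000
RATE = 1.0
population = [random.expovariate(RATE) for _ in range(POPULATION_SIZE)]

population_mean = statistics.fmean(population)
population_median = statistics.median(population)

print(f"mean   = {population_mean:.3f}")
print(f"median = {population_median:.3f}")  # lower than the mean: right-skewed
```

The median falling well below the mean is a quick check that the population is strongly right-skewed, that is, far from normal.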
Plot the means of 1000 samples drawn from the population, each of size 30
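One way to sketch this step is shown below. It rebuilds a hypothetical exponential population (rate 1.0, an assumption) so the snippet runs on its own, and it prints a rough text histogram to stay within the standard library; a plotting library such as matplotlib would give a proper figure.

```python
import random
import statistics

random.seed(0)  # arbitrary seed so the run is reproducible

# Hypothetical population: exponential with rate 1.0 (an assumption).
population = [random.expovariate(1.0) for _ in range(100_000)]

NUM_SAMPLES = 1000
SAMPLE_SIZE = 30

# Draw 1000 samples of size 30 and record the mean of each one.
sample_means = [
    statistics.fmean(random.sample(population, SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

# Rough text histogram: bins of width 0.1, one '#' per 5 sample means.
BIN_WIDTH = 0.1
counts = {}
for m in sample_means:
    bin_index = int(m / BIN_WIDTH)
    counts[bin_index] = counts.get(bin_index, 0) + 1

for bin_index in sorted(counts):
    bar = "#" * (counts[bin_index] // 5)
    print(f"{bin_index * BIN_WIDTH:4.1f} | {bar}")

print(f"mean of sample means = {statistics.fmean(sample_means):.3f}")
print(f"std of sample means  = {statistics.stdev(sample_means):.3f}")
```

Under these assumptions the histogram should look roughly bell-shaped and centered near the population mean, with a spread close to 1/√30.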
How does a sample size below or above 30 affect the distribution of sample means?
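As the sample size increases, the distribution of sample means resembles a normal distribution more and more closely.
For sample sizes of 30 or more, the plot of sample means is close to a normal distribution, although for a strongly skewed population such as the exponential a slight right skew can remain.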
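The effect of sample size can also be checked numerically. The sketch below is an illustration, not the original analysis: it assumes an exponential population with rate 1.0, uses 5000 samples per size for steadier estimates, and defines a hypothetical helper `skewness` that returns 0 for a symmetric distribution.

```python
import random
import statistics

random.seed(1)  # arbitrary seed so the run is reproducible

# Hypothetical right-skewed population: exponential with rate 1.0.
population = [random.expovariate(1.0) for _ in range(100_000)]

NUM_SAMPLES = 5000  # more samples than in the text, for steadier estimates


def skewness(values):
    """Mean cubed z-score; 0 for a perfectly symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


# For each sample size, store (spread of sample means, skewness of sample means).
results = {}
for sample_size in (2, 5, 30, 100):
    means = [
        statistics.fmean(random.sample(population, sample_size))
        for _ in range(NUM_SAMPLES)
    ]
    results[sample_size] = (statistics.pstdev(means), skewness(means))

for sample_size, (spread, skew) in results.items():
    print(f"n={sample_size:>3}  std of means={spread:.3f}  skewness={skew:.3f}")
```

For an exponential population the skewness of the sample mean is 2/√n in theory, so it shrinks toward 0 as n grows but is still around 0.37 at n = 30. This is why 30 works as a rule of thumb rather than a guarantee.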