Normal Distribution and CLT in Data Science

By Mahesh Kankrale
July 31, 2024
Data Science


A normal distribution is a continuous probability distribution whose probability density function is a symmetric bell curve. Simply put, most of the data is concentrated around one central point (the mean), and fewer points taper off symmetrically towards the two opposite ends. This article covers the normal distribution and the central limit theorem (CLT) in data science: the key concepts, their importance in statistical analysis, and their applications to real-world data.
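The bell shape can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`. The mean of 0 and standard deviation of 1 are illustrative assumptions, not values from this article. It evaluates the density at a few points to show that it peaks at the mean and is symmetric around it.

```python
from statistics import NormalDist

# Illustrative parameters (an assumption, not taken from the article).
dist = NormalDist(mu=0.0, sigma=1.0)

def density_table(points):
    """Return (x, density at x) pairs for the given points."""
    return [(x, dist.pdf(x)) for x in points]

table = density_table([-2, -1, 0, 1, 2])
for x, p in table:
    # Densities at -x and +x match, and the largest value is at the mean.
    print(f"x = {x:+d}  density = {p:.4f}")
```

Changing `mu` shifts the curve left or right, and changing `sigma` makes it wider or narrower, but the symmetry around the mean stays the same.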

Empirical Rule:

The empirical rule, also known as the 68-95-99.7 rule or the three-sigma rule, is a statistical rule of thumb that describes the approximate percentage of values that fall within a certain number of standard deviations from the mean in a normal distribution:

Approximately 68% of the data falls within one standard deviation of the mean.
Approximately 95% of the data falls within two standard deviations of the mean.
Approximately 99.7% of the data falls within three standard deviations of the mean.

This rule is based on properties of the normal distribution and provides a quick way to estimate the spread of data and identify outliers.
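The three percentages can be reproduced from the normal distribution itself. As a minimal sketch, the probability that a normal value lies within k standard deviations of the mean is erf(k/√2), which Python's `math.erf` can evaluate. The helper name `prob_within` is hypothetical and used only for illustration.

```python
import math

def prob_within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.2%}")
```

Because the result does not depend on the mean or the standard deviation, the same percentages apply to every normal distribution.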


Central Limit Theorem:

The central limit theorem (CLT) is a fundamental concept in statistics. It tells us about the distribution of averages (means) from samples drawn from a population. Here’s the gist of it:

Large samples: The CLT applies when you take a large enough random sample from a population, regardless of the original shape of the population’s distribution (normal, skewed, etc.).
Sample means become normal: The distribution of the means of those samples will tend towards a normal distribution (a bell-shaped curve) as the sample size increases.
Mean and standard deviation: The average of the sample means equals the population mean, and the standard deviation of the sample means (the standard error) equals the population’s standard deviation divided by the square root of the sample size, σ/√n.

In short, whatever the form of the population distribution (provided its variance is finite), the sampling distribution of the mean tends to a Gaussian, and the central limit theorem also gives its dispersion, which shrinks as the sample size grows.
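A small simulation can make the theorem concrete. The sketch below rests on several assumptions that are not from this article: a skewed exponential population with mean 1 and standard deviation 1, a sample size of 50, and 5000 repeated samples. It compares the observed spread of the sample means with the population standard deviation divided by √n.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

POP_MEAN = 1.0  # an exponential distribution with rate 1 has mean 1
POP_SD = 1.0    # and standard deviation 1

def sample_means(sample_size, num_samples):
    """Draw num_samples samples of the given size and return their means."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(sample_size))
        for _ in range(num_samples)
    ]

n = 50
means = sample_means(sample_size=n, num_samples=5000)

observed_mean = statistics.fmean(means)
observed_sd = statistics.stdev(means)
expected_sd = POP_SD / n ** 0.5  # population sd divided by sqrt(n)

# Share of sample means within one expected_sd of the population mean;
# for a bell-shaped distribution this should be roughly 68%.
within_one = sum(abs(m - POP_MEAN) <= expected_sd for m in means) / len(means)

print(f"mean of sample means: {observed_mean:.3f}")
print(f"sd of sample means:   {observed_sd:.3f} (expected about {expected_sd:.3f})")
print(f"share within one sd:  {within_one:.1%}")
```

Even though the exponential population is strongly skewed, the sample means cluster symmetrically around the population mean. Raising `n` should narrow that cluster further.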

Log-Normal Distribution:

In probability theory, a log-normal (or lognormal) distribution is a continuous probability distribution of a random variable whose logarithm is normally distributed. Thus, if the random variable X is log-normally distributed, then Y = ln(X) has a normal distribution. Equivalently, if Y has a normal distribution, then the exponential function of Y, X = exp(Y), has a log-normal distribution. A random variable that is log-normally distributed takes only positive real values.
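The relationship between X and Y = ln(X) can be checked with a short simulation. This sketch draws values with the standard-library `random.lognormvariate`. The parameters (0 and 0.5 for the mean and standard deviation of the underlying normal) are illustrative assumptions. It then confirms that the logarithms behave like a normal sample with those parameters.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Illustrative parameters of Y = ln(X) (an assumption, not from the article).
MU, SIGMA = 0.0, 0.5

xs = [random.lognormvariate(MU, SIGMA) for _ in range(10_000)]
logs = [math.log(x) for x in xs]  # Y = ln(X) should look normal

print(f"smallest X:      {min(xs):.4f}  (always positive)")
print(f"mean of ln(X):   {statistics.fmean(logs):.3f}  (close to MU)")
print(f"sd of ln(X):     {statistics.stdev(logs):.3f}  (close to SIGMA)")
print(f"mean vs median:  {statistics.fmean(xs):.3f} vs {statistics.median(xs):.3f}")
```

The mean of X comes out above its median, which reflects the long right tail of a log-normal distribution, while ln(X) itself is symmetric.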


Author:

Mahesh Kankrale

