What is the Empirical Rule in Statistics?

Formula

The formula gives the predicted percentage of observations that fall within each standard deviation from the mean.

The Rule says that:

  • 68% of the observations will lie within +/- 1 standard deviation from the mean
  • 95% of the observations will lie within +/- 2 standard deviations from the mean
  • 99.7% of the observations will lie within +/- 3 standard deviations from the mean
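These percentages are rounded values of the normal distribution's coverage, and they can be checked numerically. The sketch below uses only Python's standard library; the helper name `coverage_within` is an illustrative choice, not something from the article.

```python
import math

def coverage_within(k: float) -> float:
    """Probability that a normally distributed value lies within
    k standard deviations of its mean."""
    # For a normal distribution, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within +/- {k} SD: {coverage_within(k):.2%}")
```

The printed values are roughly 68.27%, 95.45%, and 99.73%, which the rule rounds to 68%, 95%, and 99.7%.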


How to Use?

The rule can be used to forecast trends in a data set. When a population is very large, studying every member becomes impractical. Suppose, for example, that you are asked to find the average salary of all the accountants in the US. That is a difficult task because the population is enormous. In that case, you can randomly select, say, 90 observations from the population and apply the empirical rule to that sample to estimate how the population as a whole behaves.

You now have 90 salaries. Next, find the mean and standard deviation of these observations. If the observations follow a normal distribution, the empirical rule can be applied to estimate the salaries of all accountants in the US.
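As a rough illustration of this step, the sketch below computes the mean and standard deviation of a 90-observation sample. The salaries are synthetic, generated with a fixed random seed purely for demonstration; they are an assumption and not real salary data.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is repeatable

# Synthetic stand-in for 90 randomly sampled salaries (NOT real data).
sample = [random.gauss(90_000, 5_000) for _ in range(90)]

sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample SD (n - 1 in the denominator)

print(f"n = {len(sample)}, mean = {sample_mean:,.0f}, SD = {sample_sd:,.0f}")
```

With real observations, `sample` would simply be the list of collected salaries.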

Say the sample's mean salary comes out to $90,000 and its standard deviation to $5,000. By the rule, about 68% of the accountants in the population draw a salary within +/- 1 standard deviation of the mean, that is, $90,000 +/- (1 * $5,000), or $85,000 to $95,000.

Widening the interval, about 95% of all the accountants in the US are paid within the mean +/- 2 standard deviations: $90,000 +/- (2 * $5,000). So the range is $80,000 to $100,000.

In a still broader range, about 99.7% of all accountants draw salaries within the mean +/- 3 standard deviations: $90,000 +/- (3 * $5,000). The range is $75,000 to $105,000.
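The three salary ranges can be reproduced with a few lines of arithmetic. This is a minimal sketch using the article's figures (mean $90,000, SD $5,000); the function name `empirical_ranges` is hypothetical.

```python
def empirical_ranges(mean, sd):
    """Return the mean +/- k*SD interval for k = 1, 2, 3,
    keyed by the coverage the empirical rule assigns to it."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {label: (mean - k * sd, mean + k * sd)
            for k, label in coverage.items()}

ranges = empirical_ranges(90_000, 5_000)
for label, (low, high) in ranges.items():
    print(f"{label}: ${low:,.0f} to ${high:,.0f}")
# 68%: $85,000 to $95,000
# 95%: $80,000 to $100,000
# 99.7%: $75,000 to $105,000
```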

You can see that the population can be estimated without studying all of it. For example, someone planning to work as an accountant in the US can expect, with about 99.7% likelihood, a salary between $75,000 and $105,000.

This kind of estimation helps to ease work and make forecasts regarding the future.

Empirical Rule Examples

Mr. X is trying to find the average number of years a person survives after retirement, taking the retirement age to be 60. If the mean survival time in 50 random observations is 20 years and the SD is 3 years, determine the probability that a person will draw a pension for more than 23 years.

Solution

The empirical rule states that 68% of the observations will lie within 1 standard deviation from the mean. Here, the mean of the observations is 20.

68% of the observations will lie within 20 +/- 1 (standard deviation), which is 20 +/- 3. So, the range is 17 to 23.

There is a 68% chance that the number of years a person survives after retirement lies between 17 and 23. The percentage lying outside this range is (100 - 68) = 32%. Because the normal distribution is symmetric, this 32% is split equally between the two sides: a 16% chance that the number of years is below 17 and a 16% chance that it is greater than 23.

So, the probability that the person will draw a pension for more than 23 years is 16%.
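The same tail probability can be checked against the normal distribution directly. The sketch below assumes survival years are normally distributed with the example's mean of 20 and SD of 3; `normal_cdf` is an illustrative helper built on the standard library's `math.erf`.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative probability P(X <= x) for a normal distribution."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

mean_years, sd_years = 20, 3

# Empirical-rule shortcut: half of the 32% that lies outside +/- 1 SD.
rule_tail = (1 - 0.68) / 2

# Direct calculation of P(X > 23) from the normal CDF.
exact_tail = 1 - normal_cdf(23, mean_years, sd_years)

print(f"rule of thumb: {rule_tail:.1%}, normal CDF: {exact_tail:.1%}")
```

The direct calculation gives about 15.9%, which the empirical rule rounds to 16%.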

Empirical Rule vs. Chebyshev’s Theorem

The empirical rule applies to data sets that follow a normal distribution, that is, a bell-shaped one. A normal distribution is symmetric: half of the probability lies on each side of the mean.

Chebyshev’s theorem provides another rule that applies to all types of data sets, including those that are not normally distributed. It says three things:

  • At least 3/4 of all the observations will lie within 2 standard deviations from the mean. This is a guaranteed minimum rather than an estimate. For example, if there are 100 observations, then at least 75 of them will lie within +/- 2 standard deviations from the mean.
  • At least 8/9 of all the observations will lie within 3 standard deviations from the mean.
  • At least 1 - 1/k^2 of all the observations will lie within k standard deviations from the mean. Here, k is any number greater than 1.
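The Chebyshev bound 1 - 1/k^2 is easy to evaluate and to compare with the empirical rule's figures. This is a minimal sketch; the function name `chebyshev_lower_bound` is hypothetical, and the check on k reflects that the bound is only informative for k greater than 1.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum fraction of observations within k standard deviations
    of the mean, for any distribution with a finite variance."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2

empirical = {2: 0.95, 3: 0.997}  # empirical-rule figures for comparison
for k, normal_share in empirical.items():
    bound = chebyshev_lower_bound(k)
    print(f"k = {k}: Chebyshev >= {bound:.1%}, empirical rule ~ {normal_share:.1%}")
```

For k = 2 the bound is 75% and for k = 3 it is about 88.9%, both lower than the empirical rule's values because Chebyshev's theorem makes no assumption about the shape of the distribution.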

When to Use?

Data is like gold in the modern world. Huge flows of data from different sources are used for approximations and forecasts. If a data set follows a normal distribution, its plot shows a bell-shaped curve, and one can use the empirical rule. The rule is applied to sample observations to create an approximation for the population.

Once the observations show a normal distribution structure, one may use the empirical rule to find the probability that an observation falls within a given range. The rule is extremely useful for many statistical forecasts.
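One informal way to judge whether the rule suits a data set is to measure what share of the observations actually falls within 1, 2, and 3 standard deviations and compare it with 68%, 95%, and 99.7%. The sketch below does this on synthetic, seeded data that stands in for real observations; it is a rough screening idea under that assumption, not a formal normality test, and `fraction_within` is a hypothetical helper.

```python
import random
import statistics

def fraction_within(data, k):
    """Share of observations lying within k sample standard
    deviations of the sample mean."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    inside = sum(1 for x in data if abs(x - mean) <= k * sd)
    return inside / len(data)

random.seed(1)  # fixed seed so the illustration is repeatable
# Synthetic normally distributed observations (NOT real data).
data = [random.gauss(0, 1) for _ in range(10_000)]

for k, expected in ((1, 0.68), (2, 0.95), (3, 0.997)):
    print(f"k = {k}: observed {fraction_within(data, k):.1%}, rule says {expected:.1%}")
```

If the observed shares differ markedly from the rule's percentages, the data are probably not close to normal, and a distribution-free bound such as Chebyshev's theorem is the safer choice.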

Conclusion

The empirical rule is a statistical concept that describes the probability of observations falling within a given range and is very useful for approximating a huge population. However, one should always note that these are approximations. There are always chances of outliers that don’t fit the distribution. So, the findings are not exact, and precautions should be taken when acting on the forecast.
