Calculate Outlier Formula: A Step-By-Step Guide | Outlier (2024)

In This Article

  1. What Is the Outlier Formula?

  2. What Are Q1, Q3, and IQR?

  3. How to Find Outliers in a Data Set

  4. Examples of Outlier Formula

  5. Calculate Outliers Using Statistical Software

  6. FAQs About the Outlier Formula


What Is the Outlier Formula?

The outlier formula — also known as the 1.5 IQR rule — is a rule of thumb used for identifying outliers. Outliers are extreme values that lie far from the other values in your data set.

The outlier formula designates outliers based on an upper and lower boundary (you can think of these as cutoff points). Any value more than 1.5 x IQR above the third quartile is designated as an outlier, and any value more than 1.5 x IQR below the first quartile is also designated as an outlier.

How to identify outliers using the outlier formula:

Anything above Q3 + 1.5 x IQR is an outlier
Anything below Q1 - 1.5 x IQR is an outlier

What Are Q1, Q3, and IQR?

To use the outlier formula, you need to know what quartiles (Q1, Q2, and Q3) and the interquartile range (IQR) are.

Quartiles (Q1, Q2, Q3) divide a data set into four groups, each containing about 25% (or a quarter) of the data points. There are three quartiles: Q1, Q2, and Q3. Q1 (also known as the first quartile or lower quartile) is the 25th percentile of the data. Q2 (the second quartile) is the 50th percentile or median of the data. Q3 (the third or upper quartile) is the 75th percentile of the data.

The Interquartile Range (IQR) is the distance between the first and third quartile. Subtract the first quartile from the third quartile to find the interquartile range.

IQR = Q3 - Q1

How to Find Outliers in a Data Set

Now that you know what quartiles and the interquartile range are, let’s go through a step-by-step example of using the outlier equation. We’ll use a sample data set containing just 10 data points for this example.

Sample Data (n=10)
27, 2, 22, 29, 19, 30, 32, 59, 52, 35

Step 1

Arrange the data in order from smallest to largest.

2, 19, 22, 27, 29, 30, 32, 35, 52, 59

Step 2

Find the first quartile, Q1.

To find Q1, multiply 25/100 by the total number of data points (n). This will give you a locator value, L. If L is a whole number, take the average of the Lth value of the data set and the (L+1)th value. The average will be the first quartile. If L is not a whole number, round L up to the nearest whole number and find the corresponding value in the data set. That will be the first quartile.

L = (25/100)(n) = (0.25)(10) = 2.5

2.5 is not a whole number, so round up to the nearest whole number to get 3. The 3rd value in the ordered data set is 22. Q1 = 22

Step 3

Find the third quartile, Q3.

To find Q3, use the same method used to find Q1, except this time, multiply 75/100 by n to get the locator value, L.

L = (75/100)(n) = (0.75)(10) = 7.5

7.5 is not a whole number, so round up to the nearest whole number to get 8. The 8th value in the ordered data set is 35. Q3 = 35
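The locator rule from Steps 1 to 3 can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the function name locator_quartile and its percent argument are our own, and it assumes you only ask for the 25th or 75th percentile.

```python
def locator_quartile(values, percent):
    """Quartile by the locator rule above (percent is 25 or 75)."""
    data = sorted(values)            # Step 1: order smallest to largest
    n = len(data)
    whole, remainder = divmod(percent * n, 100)   # L = percent * n / 100
    if remainder == 0:
        # L is a whole number: average the Lth and (L+1)th values
        return (data[whole - 1] + data[whole]) / 2
    # L is not a whole number: round up and take that value
    return data[whole]               # index `whole` is the (whole + 1)th value

sample = [27, 2, 22, 29, 19, 30, 32, 59, 52, 35]
q1 = locator_quartile(sample, 25)    # L = 2.5, so the 3rd value
q3 = locator_quartile(sample, 75)    # L = 7.5, so the 8th value
print(q1, q3)  # 22 35
```

Integer arithmetic (divmod) is used to decide whether L is a whole number, which avoids floating-point rounding surprises.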

Step 4

Find the interquartile range, IQR.

Remember, the interquartile range is the difference between Q3 and Q1.

IQR = Q3 - Q1 = 35 - 22 = 13

Step 5

Find the upper boundary.

Upper boundary = Q3 + 1.5 IQR = 35 + (1.5)(13) = 54.5

Step 6

Find the lower boundary.

Lower boundary = Q1 - 1.5 IQR = 22 - (1.5)(13) = 2.5

Step 7

Identify the outliers.

The outliers are any data points that lie above the upper boundary or below the lower boundary. In this case, the outliers are 2 and 59.
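Steps 4 to 7 can be checked with a short Python sketch. The helper name iqr_fences is our own, and Q1 = 22 and Q3 = 35 are taken from Steps 2 and 3 rather than recomputed.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower boundary, upper boundary) for the 1.5 IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

sample = [27, 2, 22, 29, 19, 30, 32, 59, 52, 35]
lower, upper = iqr_fences(22, 35)

# Keep every point that falls strictly outside the boundaries.
outliers = sorted(x for x in sample if x < lower or x > upper)
print(lower, upper, outliers)  # 2.5 54.5 [2, 59]
```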

Examples of Outlier Formula

Here are three more examples. See if you can identify outliers using the outlier formula.

Example 1

The data below shows a high school basketball player’s points per game in 10 consecutive games. Use the outlier formula and the given data to identify potential outliers.

Points Per Game
15, 24, 33, 48, 28, 21, 22, 51, 30, 31

Example 2

The data below shows the number of daily visitors to a museum. Use the given data and outlier formula to identify potential outliers.

Daily Visitors
732, 680, 815, 720, 693, 503, 740, 670

Example 3

The data below shows the annual rainfall in a tropical rainforest. For ease, the data are already arranged from least to greatest. Use the given data and outlier formula to identify potential outliers.

Annual Rainfall (in.)
80, 93, 108, 112, 220, 225, 230, 233, 239, 245,
249, 251, 258, 273, 320, 349, 387, 410, 420, 429

Solution for Example 1

Outliers: 51. Q1 = 22, Q3 = 33, IQR = 11, lower boundary = 5.5, upper boundary = 49.5

Solution for Example 2

Outliers: 503. Q1 = 675, Q3 = 736, IQR = 61, lower boundary = 583.5, upper boundary = 827.5.

Note that there are only 8 data points (n=8). When calculating Q1 and Q3, the locator value L is a whole number.

To find Q1, you need to take the average of the 2nd and 3rd values of the data set. To find Q3, you need to take the average of the 6th and 7th values.

Solution for Example 3

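There are no outliers in this data set. Q1 = 222.5, Q3 = 334.5, IQR = 112, lower boundary = 54.5, upper boundary = 502.5.

Note that there are 20 data points (n=20), so the locator value L is a whole number for both quartiles. Q1 is the average of the 5th and 6th values, and Q3 is the average of the 15th and 16th values.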
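If you want to check your answers, here is a small end-to-end Python sketch that applies the same hand method (locator rule for the quartiles, then the 1.5 IQR boundaries) to all three examples. The function names are our own, chosen for illustration.

```python
def locator_quartile(values, percent):
    """Quartile by this article's locator rule (percent is 25 or 75)."""
    data = sorted(values)
    whole, remainder = divmod(percent * len(data), 100)
    if remainder == 0:
        return (data[whole - 1] + data[whole]) / 2   # average Lth and (L+1)th
    return data[whole]                               # L rounded up

def find_outliers(values):
    """Values outside Q1 - 1.5 IQR and Q3 + 1.5 IQR."""
    q1 = locator_quartile(values, 25)
    q3 = locator_quartile(values, 75)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return sorted(x for x in values if x < lower or x > upper)

points = [15, 24, 33, 48, 28, 21, 22, 51, 30, 31]
visitors = [732, 680, 815, 720, 693, 503, 740, 670]
rainfall = [80, 93, 108, 112, 220, 225, 230, 233, 239, 245,
            249, 251, 258, 273, 320, 349, 387, 410, 420, 429]

print(find_outliers(points))    # [51]
print(find_outliers(visitors))  # [503]
print(find_outliers(rainfall))  # []
```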

Calculate Outliers Using Statistical Software

While it’s important to know what the outlier formula is and how to find outliers by hand, more often than not, you will use statistical software to identify outliers.

Follow these steps to use the outlier formula in Excel, Google Sheets, Desmos, or R.

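Note that there are several accepted ways to calculate quartiles. Some of the software below uses different approaches to calculating quartiles than what we used in the examples above, so the quartiles and boundaries you get may differ slightly from the hand calculations. With larger data sets the difference is usually negligible. With small data sets, a value close to a boundary can be flagged by one method and not by another.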
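How much the choice of quartile method matters depends on the data, and small samples are the most sensitive case. As a rough illustration, the Python sketch below compares the hand method from this article with the two interpolation methods offered by Python's statistics.quantiles on the 10-point sample from earlier. The "inclusive" method is, to our understanding, the same interpolation used by Excel's QUARTILE and R's default quantile. The helper name flagged is our own.

```python
import statistics

def flagged(values, q1, q3):
    """Points outside the 1.5 IQR boundaries for the given quartiles."""
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return sorted(x for x in values if x < lower or x > upper)

sample = [27, 2, 22, 29, 19, 30, 32, 59, 52, 35]

# Quartiles from the hand calculation in this article (locator rule).
print("locator", 22, 35, flagged(sample, 22, 35))   # locator 22 35 [2, 59]

results = {}
for method in ("inclusive", "exclusive"):
    q1, _, q3 = statistics.quantiles(sample, n=4, method=method)
    results[method] = (q1, q3, flagged(sample, q1, q3))
    print(method, q1, q3, results[method][2])
# inclusive 23.25 34.25 [2, 52, 59]
# exclusive 21.25 39.25 []
```

On a sample this small, the three conventions place the boundaries differently enough to disagree about borderline points, so it is worth knowing which one your software uses.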

1. In Excel or Google Sheets

You can use the outlier formula in Excel or Google Sheets using the following steps.

  • To find the first quartile, use the formula =QUARTILE(data range, 1). Some regional settings use a semicolon instead of a comma as the separator.

For example, if your data is in cells A2 through A11, you would type =QUARTILE(A2:A11, 1)

  • To find the third quartile, use the formula =QUARTILE(data range, 3)

For example, if your data is in cells A2 through A11, you would type =QUARTILE(A2:A11, 3)

  • Subtract Q1 from Q3 to get the interquartile range.

  • Calculate the upper boundary: Q3 + (1.5)(IQR)

  • Calculate the lower boundary: Q1 - (1.5)(IQR)

2. In Desmos

You can use the Outlier formula in Desmos using the following steps.

  • Create a table and input your data in the x1 column.

  • Use the function stats(x1) to find Q1 and Q3 for your data.

  • Subtract Q1 from Q3 to get the interquartile range.

  • Calculate the upper boundary: Q3 + (1.5)(IQR)

  • Calculate the lower boundary: Q1 - (1.5)(IQR)

3. In R

You can use the outlier formula in R using the following steps.

  • Save your data using the assignment operator, <-, and the combine function c(). Give the data a name like mydata.

For example, say your data consists of the following values (15, 21, 25, 29, 32, 33, 40, 41, 49, 72).

Type: mydata <- c(15, 21, 25, 29, 32, 33, 40, 41, 49, 72)

  • Use the summary function to find Q1 and Q3. Type: summary(mydata)

  • Use the IQR function to find the interquartile range. Type: IQR(mydata)

  • Calculate the upper boundary: Q3 + (1.5)(IQR)

  • Calculate the lower boundary: Q1 - (1.5)(IQR)

For practice, try using one or more of these programs to find the outliers from the examples we covered in the previous section.
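Python is not covered in the steps above, but if that is what you have on hand, the same workflow looks like the sketch below. It uses the standard library's statistics.quantiles with method="inclusive", which we believe matches the interpolation behind R's summary() and IQR() defaults. The data are the same values used in the R example.

```python
import statistics

mydata = [15, 21, 25, 29, 32, 33, 40, 41, 49, 72]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, median, q3 = statistics.quantiles(mydata, n=4, method="inclusive")
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = [x for x in mydata if x < lower or x > upper]

print(q1, q3, iqr)    # 26.0 40.75 14.75
print(lower, upper)   # 3.875 62.875
print(outliers)       # [72]
```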

FAQs About the Outlier Formula

Here are some frequently asked questions about the outlier formula.

When should I remove outliers?

There isn’t a hard-and-fast rule about when you should (or shouldn’t) remove outliers from your data. Outliers can occur for different reasons. Sometimes, outliers result from an error that occurred during the data collection process. If it’s obvious that an outlier results from a data collection error, it’s safe to remove it. You might also choose to re-measure the data point if you can.

If you’re not sure if an outlier results from an error, your first instinct shouldn’t be to remove it. The outlier may provide some important insights about your data, and if you remove it, those insights will be lost. A better solution would be to adjust your method of analysis and to think carefully about why the outlier exists. You might also choose to run your analysis with and without the outlier and present both sets of results for the sake of transparency.

Can there be a negative outlier?

Yes. If your data contains negative values, outliers can be negative numbers.

How does removing the outlier affect the mean?

The mean of the data set is sensitive to outliers, so removing an outlier can dramatically change the value of the mean. If you remove an unusually high outlier, the mean will decrease. If you remove an unusually low outlier, the mean will increase.

How does removing outliers affect the median?

The median of the data set is resistant to outliers, so removing an outlier shouldn’t dramatically change the value of the median. After removing an outlier, the value of the median can change slightly, but the new median shouldn’t be too far from its original value.
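To see the difference between the mean and the median in practice, the Python sketch below removes the high outlier (59) from the 10-point sample used earlier in this article and compares both statistics before and after.

```python
import statistics

sample = [27, 2, 22, 29, 19, 30, 32, 59, 52, 35]
without_high = [x for x in sample if x != 59]   # drop the high outlier

mean_before = statistics.mean(sample)
mean_after = statistics.mean(without_high)
median_before = statistics.median(sample)
median_after = statistics.median(without_high)

print(mean_before, median_before)           # 30.7 29.5
print(round(mean_after, 2), median_after)   # 27.56 29
```

The mean drops by about 3 points, while the median moves by only 0.5.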

Can normal distributions have outliers?

Yes. Values that lie in a normal distribution’s extreme right and left tails can be considered outliers. You can use Z-scores to identify outliers in a normal distribution. If you apply the outlier formula, any value in a normal distribution with a Z-score above about 2.7 or below about -2.7 should be considered an outlier. The exact cutoff is about ±2.698. Rounding the quartiles to ±0.67 first gives the often-quoted ±2.68.
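The Z-score cutoff follows from the quartiles of the standard normal distribution. Here is a short sketch of the derivation using Python's statistics.NormalDist. The unrounded cutoff is about 2.698, and rounding the quartile to 0.67 before multiplying gives 2.68.

```python
from statistics import NormalDist

q3 = NormalDist().inv_cdf(0.75)   # third quartile of the standard normal
iqr = 2 * q3                      # by symmetry, Q1 = -Q3
upper_z = q3 + 1.5 * iqr          # upper boundary, equal to 4 * Q3
lower_z = -upper_z                # lower boundary by symmetry

print(round(q3, 4), round(upper_z, 3))  # 0.6745 2.698
print(round(0.67 + 1.5 * (2 * 0.67), 2))  # 2.68
```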

For more on normal distribution, Duke University's Dr. Olanrewaju Michael Akande gives an overview.

Can a data set have more than one outlier?

Yes. It’s possible to have more than one outlier in your data.

Is the outlier formula the only method of identifying outliers?

No. The outlier formula is a commonly used and straightforward method, but there are other ways to identify outliers. Statisticians will often plot their data on graphs such as box plots and scatterplots to identify outliers. They may also use regression, hypothesis testing, and Z-scores to identify outliers.


FAQs

What is the formula to find the outlier?

What is the 1.5 IQR rule? This is a rule which uses the Quartiles and IQR of a data set to determine the upper and lower fence of a data set. Any data point which lies beyond these fences is considered an outlier. The formula for the upper fence is Q3 + 1.5(IQR), and the formula for the lower fence is Q1 - 1.5(IQR).

How do you find the outlier of a 6th grade data set?

To find the outlier, look for the value that is either much larger or much smaller than all the other values. For example, a value of 2 is an outlier when it is much smaller than every other value in the set.

What is the outlier in the following set: 28 31 40 26 38 65 39 35 30?

The number 65 is the outlier in the set 28, 31, 40, 26, 38, 65, 39, 35, 30, as it is significantly higher than the rest of the numbers in the data set.

What is an outlier and how do you find it?

An outlier is an observation that lies an abnormal distance from other values in a random sample from a population. In a sense, this definition leaves it up to the analyst (or a consensus process) to decide what will be considered abnormal.

What is an example of an outlier?

When a value is called an outlier it usually means that that value deviates from all other values in a data set. For example, in a group of 5 students the test grades were 9, 8, 9, 7, and 2. The last value seems to be an outlier because it falls below the main pattern of the other grades.

What is the formula for an outlier in A-level maths?

First, calculate the interquartile range and multiply it by 1.5. Then, subtract this value from the 1st quartile and then also add it to the 3rd quartile. The two values that you end up with are the acceptable statistical data range. Any data point outside this would be an outlier.

How to calculate outliers with IQR?

We can use the IQR method of identifying outliers to set up a “fence” outside of Q1 and Q3. Any values that fall outside of this fence are considered outliers. To build this fence we take 1.5 times the IQR and then subtract this value from Q1 and add this value to Q3.

How do you find outliers with a formula?

The Z-value helps to identify outliers. Z = (x - μ) / σ, where μ is the mean of the data and σ is the standard deviation of the data. Data points with Z-values above 3 or below -3 are commonly treated as outliers.

What is the first step to identifying outliers in a data set?

The first step to identifying outliers in a data set is to create parameters based on statistical methods such as the interquartile range or z-scores. Once the parameters are established, data points outside the defined range can be flagged as outliers.

What is an outlier in a math equation?

An outlier is a value in a data set that is very different from the other values. That is, outliers are values unusually far from the middle.

What is the outlier in the following set: 29 2 28 30 26 31?

You can see that in this set, most of the numbers are between 26 and 31, with the exception of the number 2. This means that the number 2 is the outlier, because there is no other number close to 2.

Why do we use 1.5 IQR for outliers?

The 1.5 multiplier is a convention. For normally distributed (Gaussian) data, boundaries at 1.5 x IQR beyond the quartiles sit about 2.7 standard deviations from the mean, so only a small fraction of ordinary values are flagged. Under this rule, any data point that is more than 1.5 x IQR below the first quartile or more than 1.5 x IQR above the third quartile is an outlier.

Which method can be used to identify outliers?

IQR method

Box plots use the IQR method to highlight outliers. IQR stands for interquartile range, which is the difference between Q3 (the 75th percentile) and Q1 (the 25th percentile). The IQR method computes a lower bound and an upper bound to identify outliers.

Article information

Author: Amb. Frankie Simonis
