Quantiles are an important statistical concept that allows us to divide data into equal groups. They are often used to identify and analyze data patterns and make meaningful comparisons between different datasets. In this quick guide, we'll cover the basics of quantiles and take a closer look at some of the most common types: quartiles, deciles, and percentiles.

## Quantile

A quantile is a measure that indicates the value below which a certain proportion of observations in a group of observations fall. Quantiles are used in statistics to divide a group of observations into equal-sized groups. For example, the 0.25 quantile is the value below which 25% of the observations fall; the 0.50 quantile is the value below which 50% falls, and so on. Another related measurement is the Median, which is the same as the 0.50 quantile because 50% of data fall below the median.

## What are some common quantiles?

Some common quantiles include:

## 1. Quartile

A quartile is a type of quantile that divides a group of observations into four equal-sized groups. For example, in a group of observations, the first quartile (Q1) is the value below which the first 25% of the observations fall, the second quartile (Q2, also known as the median) is the value below which the middle 50% of the observations fall, and the third quartile (Q3) is the value below which the last 25% of the observations fall.

## 2. Decile

A decile is a measure that divides a group of observations into ten equal-sized groups. For example, in a group of observations, the first decile (D1) is the value below which the first 10% of the observations fall, the second decile (D2) is the value below which the first 20% of the observations fall, and so on. The 9th decile (D9) is the value below which the last 10% of the observations fall.

## 3. Percentile

A percentile is a measure that indicates the value below which a certain percentage of observations in a group of observations fall. For example, in a group of observations, the 20th percentile (P20) is the value below which the first 20% of the observations fall, the 50th percentile (P50) is the value below which the middle 50% of the observations fall, and the 95th percentile (P95) is the value below which the last 95% of the observations fall.

The 50th percentile is also the median, second quartile, and 5th decile.

## Percentile: Manual / Microsoft Excel Calculation

A percentile is a measure used in statistics to indicate the value below which a certain percentage of observations falls in a group of observations.

To find the location of a specific percentile, software such as Minitab, Python, R, and Excel uses these steps:

- Arrange the observations in ascending order.
- Use the formula for percentile location to calculate the position where the percentile value would be located, using the desired percentile value and the total number of observations as inputs. There are two approaches: EXC (Exclusive) and INC (Inclusive). The percentile location in the EXC approach is given by the formula \(K(N+1)\), and the location in the INC approach is given by the formula \(K(N-1)+1\).
- If the percentile location is a whole number, the value at that position in the ordered list of observations is the percentile value.
- If the percentile location is not a whole number, the percentile value is calculated by calculating the value on a pro-rata basis between these two numbers.

To find the 65th percentile in a group of 8 observations, you would first arrange the observations in ascending order: 8, 9, 12, 22, 23, 33, 55, 61.

Next, you would use the formula for percentile location to calculate the position where the 65th percentile would be located:

For PERCENTILE.EXC, the calculated rank is \(K(N+1)\).

Percentile location (Using Exclusive Approach) = \(\left(\frac{65}{100}\right)(8+1)\) = 5.85

Since the percentile location is not a whole number, the 65th percentile would be between the 5th item (number 23)and the 6th item (number 33) on a pro-rata basis. This will be \(23+0.85(33-23) = 31.5\).

For PERCENTILE.INC (and PERCENTILE), the calculated rank is \(K(N-1)+1\).

Percentile location (Using Inclusive Approach) = \((65/100) (8-1)+1\) = 5.55

Since the percentile location is not a whole number, the 65th percentile would be almost in the middle of the 5th item (number 23)and the 6th item (number 33). On a pro-rata basis, this will come out to be \(23+0.55(33-23)\) = 28.5.

## Quartile: Manual Calculation Example

A quartile is a statistical value that divides a dataset into four equal parts or quarters. The first quartile, also known as the lower quartile or Q1, is the value that separates the lowest 25% of the data from the rest. The second quartile, also known as the median or Q2, is the value that separates the lowest 50% of the data from the highest 50% of the data. The third quartile, also known as the upper quartile or Q3, is the value that separates the highest 25% of the data from the rest.

For example, if we have the following numbers: 14, 9, 10, 11, 11, and 6, we can divide the data into four equal groups by finding the first, second, and third quartiles.

To find the quartiles of a dataset, we first need to arrange the data in ascending order like this: 6, 9, 10, 11, 11, 14.

Next, we need to find the median, or Q2, which is the middle value in the dataset. In this case, there are six numbers in the dataset, so the median is the average of the third and fourth values, which is 10.5.

To find the lower quartile or Q1, we take the median of the values below the median. In this case, that would be the median of 9. To find the upper quartile or Q3, we take the median of the values above the median. In this case, that would be the median of 11,11 and 14, which is 11.

So, for this dataset, the quartiles are:

Q1 = 9

Q2 = 10.5

Q3 = 11

## Why do these quartile calculation numbers not match with Excel calculation?

## Quartile Using Excel:

To calculate quartiles, software such as Microsoft Excel and Minitab uses the percentile approach, as explained earlier. Q1 is calculated as the 25th percentile, Q2 as 50th and Q3 as the 75th percentile. This results in the value of the quartile sometimes being different from the one calculated using a conventional manual calculation approach.

Let's take the same example we used earlier in the manual calculation to calculate the first quartile (Q1).

To find the quartiles of a dataset, we first need to arrange the data in ascending order like this: 6, 9, 10, 11, 11, 14.

You can use the QUARTILE.EXC or QUARTILE.INC function to find the quartiles of a set of numbers in Excel.

## Quartile.Exc

For QUARTILE.EXC, the calculated rank is K*(N+1). To calculate the location of Q1 (or 25th percentile), let's put appropriate values in this formula.

1st Quartile location (Using Exclusive Approach) = (25/100) * (6+1) = 1.75

Since the percentile location is not a whole number, the 1st quartile would be between the 1st item (number 6)and the 2nd item (number 9) on a pro-rata basis. This will come out to be \(6 + (9-6)*0.75\) = 8.25.

**Using Minitab:** If you use Minitab to calculate Q1, this is the value (8.25) you get in the descriptive statistics. Minitab uses EXC approach to calculate percentile and quartile.

## Quartile.Inc

For QUARTILE.INC (and QUARTILE), the calculated rank is K*(N-1)+1. To calculate the location of Q1 (or 25th percentile), let's put appropriate values in this formula.

Quartile location (Using Inclusive Approach) = (25/100) * (6-1) + 1 = 5/4 + 1 = 2.25

Since the percentile location is not a whole number, the 1st quartile would be between the 2nd item (number 9)and the 3rd item (number 10). On a pro-rate basis this will come out to be \(9 + (10-9)*0.25\) = 9.25.

## Summary

In general, to find the p quantile, you would arrange the observations in ascending order and use the formula for percentile location to calculate the position where the p quantile would be located. If the percentile location is a whole number, the p quantile is the value at that position in the ordered list of observations. If the percentile location is not a whole number, the p quantile is calculated on pro-rata basis.