# A Custom Percentile VBA Function

July 17, 2013

Introduction

The definition of percentile according to Wikipedia is the following: “A percentile (or a centile) is a measure used in statistics indicating the value below which a given percentage of observations in a group of observations fall. For example, the 20th percentile is the value (or score) below which 20 percent of the observations may be found. The term percentile and the related term percentile rank are often used in the reporting of scores from norm-referenced tests. For example, if a score is in the 86th percentile, it is higher than 86% of the other scores”.

This week I had to find the 95th percentile from various experimental datasets (velocity profiles).  Since I haven’t done something similar in the past, my initial thought was to use one of the three built-in functions of Excel (2010): PERCENTILE.EXC, PERCENTILE.INC, and PERCENTILE. Note that the last one is available in Excel 2010 for backward compatibility with older Excel versions, and it is quite similar with PERCENTILE.INC function.

Problem description

Although I tried all of them I couldn’t get the desired result. The reason was that all of these functions use interpolation to find the k-th percentile whenever k is not a multiple of 1/(N – 1), where N is the number of ordered values in the input array. Below there is a short description of these functions, as well as a short reference to the interpolation issue based on Excel help:

PERCENTILE.EXC(array,k): Returns the k-th percentile of values in a range, where k is in the range 0..1, exclusive. If k is not a multiple of 1/(N – 1), PERCENTILE.EXC interpolates to determine the value at the k-th percentile.

PERCENTILE.INC(array,k):
Returns the k-th percentile of values in a range, where k is in the range 0..1, inclusive. You can use this function to establish a threshold of acceptance. For example, you can decide to examine candidates who score above the 90th percentile. If k is not a multiple of 1/(N – 1), PERCENTILE.INC interpolates to determine the value at the k-th percentile.

PERCENTILE(array,k): Returns the k-th percentile of values in a range. You can use this function to establish a threshold of acceptance. For example, you can decide to examine candidates who score above the 90th percentile. If k is not a multiple of 1/(N – 1), PERCENTILE interpolates to determine the value at the k-th percentile.

Let’s see an example: Assume that we have 5 values – 1, 2, 3, 4 and 5 – in the range A1:A5. If we calculate the 45th percentile the results using the built-in functions will be the following:
PERCENTILE.EXC(A1:A5;0.45) = 2.7
PERCENTILE.INC(A1:A5;0.45) = 2.8
PERCENTILE(A1:A5;0.45) = 2.8

The problem is that both values – 2.7 and 2.8 – are not included in the original array/range (1 to 5), due to the interpolation issue. So, for this case I needed a function that for a given percentile will return a value from the original array/range. In this specific example the desired function should return the value of 3. For the same reason if the original range was the 5, 6 60, 66, 100, then the results would be 43.8, 49.2 and 49.2 from PERCENTILE.EXC, PERCENTILE.INC and PERCENTILE respectively. However, in that case the desired 45th percentile should be 60, which is quite different from 43.8 and 49.2.

Solution

Since all the built-in functions of Excel use interpolation to find the exact percentile value, I decided to write a custom VBA function in order to bypass this issue. The “CustomPercentile” function is based on the following formula:

Where:
n: closest rank
P: percentile
N: ordered values

The P-th percentile (0 <= P <= 100) of N ordered values (arranged from least to greatest) is obtained by first calculating the (ordinal) rank n, rounding the result to the nearest integer, and then taking the value that corresponds to that rank. (Note that the rounded value of n is just the least integer which exceeds N x P/100). So, if we have 30 ordered values, then the 75th percentile will be the = round ((75/100) x 30 + 0.5; 0) = 23rd value.

In the initial example (1, 2, 3, 4, 5) the 45th percentile using this formula will give: n = round ((45/100) x 5 + 0.5; 0) = 3. So, the 45th percentile will be the third value in the ordered values, which in the specific example is the value of 3. Similarly, in the second example (5, 6 60, 66, 100) the 45th percentile will be again the third value, which is 60.

VBA code

Here is the code for “CustomPercentile” function. The function first shorts the (numeric) range from the smallest to the ‘largest using a bubble sort function and then calculates the desired percentile value:

`Option ExplicitOption Base 1Function CustomPercentile(InputRange As Range, Percentile As Double) As Variant        '-----------------------------------------------------------------    'The function calculates the desired percentile of a given range.    'First it shorts the (numeric) range from the smallest to the    'largest using BubbleSort function and then calculates the value.        'By Christos Samaras    'Date: 15/7/2013    '[email protected]    'https://myengineeringworld.net/////    '-----------------------------------------------------------------            'Declararing the necessary variables.    Dim Data() As Double    Dim i As Long    Dim j As Long    Dim NearestRank As Long                   'Check if the input range is valid.    If IsRange(InputRange) = False Then        CustomPercentile = "Input Rnage is empty or incorrect!"        Exit Function    End If        'Check if the input percentile is within the valid range.    If Percentile < 0 Or Percentile > 100 Then        CustomPercentile = "Percentile must be between 0 and 100!"        Exit Function    End If        'If the input range is has m rows and n columns then the array    'will have m x n rows and 1 column.    ReDim Data(InputRange.Rows.Count * InputRange.Columns.Count)        'Check if the input range has numeric values and then create a    '1-column array with the input range values.    For i = 1 To InputRange.Rows.Count        For j = 1 To InputRange.Columns.Count            If IsNumeric(InputRange.Cells(i, j)) = True Then                Data(i * j) = InputRange.Cells(i, j)            Else                CustomPercentile = "Numeric error in cell " & InputRange.Cells(i, j).Address & "."                Exit Function            End If        Next j    Next i                'Sort the array from the lowest to the highest value.    Call BubbleSort(Data)        'Find the nearest rank: n = P/100 * N + 1/2 where:    'n is the (ordinal) rank, rounding the result to the nearest integer    'P is the P-th percentile    'N is the number of ordered values (arranged from least to greatest)    NearestRank = Round(Percentile / 100 * (InputRange.Rows.Count * InputRange.Columns.Count) + 0.5, 0)        'Return the percentile's value.    Select Case Percentile        Case 0: CustomPercentile = Data(1)        Case 100: CustomPercentile = Data(NearestRank - 1)        Case Else: CustomPercentile = Data(NearestRank)    End SelectEnd FunctionFunction IsRange(InputRange As Range) As Boolean    '---------------------------------------------    'Checks if the input range is a valid range.        'By Christos Samaras    'Date: 15/7/2013    '[email protected]    'https://myengineeringworld.net/////    '---------------------------------------------        IsRange = False        'If the given range intersects the sheet's range, then the range is valid.    On Error Resume Next    If Not Application.Intersect(InputRange, Range("A1:XFD1048576")) Is Nothing Then        IsRange = True    End IfEnd FunctionFunction BubbleSort(ArrayToShort() As Double) As Variant        '------------------------------------------------------------    'Shorts a given array from the lowest to the highest value.    'Here it is used for numerical values only.        'By Christos Samaras    'Date: 15/7/2013    '[email protected]    'https://myengineeringworld.net/////    '------------------------------------------------------------        Dim i As Long    Dim j As Long    Dim TemporaryValue As Variant    'Loop through each element in the array and if the current array element    'is greater than the elelment next to it, switch the two elements.    For i = LBound(ArrayToShort) To UBound(ArrayToShort) - 1        For j = i + 1 To UBound(ArrayToShort)            If ArrayToShort(i) > ArrayToShort(j) Then                TemporaryValue = ArrayToShort(j)                ArrayToShort(j) = ArrayToShort(i)                ArrayToShort(i) = TemporaryValue            End If        Next j    Next i    End Function`

How to use the function

The function can be used like any Excel built-in function, following the steps below:

1. Open a workbook or create a new one.
2. Press ALT + F11 to switch to VBA editor.
3. From the File menu select import file. Alternatively press CTRL + M.
4. In the pop up window browse to your hard disk and find the mCustomPercentile.bas (module file). Select the file and press the open button.
6. Select an empty cell and type: =CustomPercentile(your range; your percentile). For example in cell B5 you can type: =CustomPercentile(A1:K1; 65). The function will return the 65th percentile from the values in the range A1:K1.
7. Finally, if you want to keep the function at your workbook save the workbook as Excel Macro-Enabled workbook or as Excel 97-2003 workbook.