Calculate 95th percentile in Ruby?

If you are interested in an existing gem, the descriptive_statistics gem is the best I have found so far for a percentile function.

IRB Session

> require 'descriptive_statistics'
=> true
irb(main):009:0> data = [1, 2, 3, 4]
=> [1, 2, 3, 4]
irb(main):010:0> data.percentile(95)
=> 3.8499999999999996
irb(main):011:0> data.percentile(95).round(2)
=> 3.85

The nice part of the gem is its elegant way of expressing "I want the 95th percentile of this data".


If you want to replicate Excel's PERCENTILE function then try the following:

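# percentile is a fraction between 0 and 1 (e.g. 0.95), as in Excel's PERCENTILE.
def percentile(values, percentile)
  values_sorted = values.sort
  rank = percentile * (values_sorted.length - 1) + 1
  k = rank.floor - 1   # zero-based index of the lower neighbour
  f = rank.modulo(1)   # fractional distance towards the upper neighbour

  # At percentile == 1.0 there is no upper neighbour, so fall back to the lower one.
  upper = values_sorted[k + 1] || values_sorted[k]
  values_sorted[k] + (f * (upper - values_sorted[k]))
end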

values = [1, 2, 3, 4]
p = 0.95
puts percentile(values, p)
#=> 3.85

The formula is based on Excel's QUARTILE function, which really just computes specific percentiles: https://support.microsoft.com/en-us/office/quartile-inc-function-1bbacc80-5075-42f1-aed6-47d735c4819d
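To illustrate the point that quartiles are just particular percentiles, here is a minimal sketch. The names `percentile_inc` and `quartiles` are made up for this example, and the interpolation is assumed to match the inclusive method described on the linked page; it is not taken from Excel itself.

```ruby
# Hypothetical helper: linear interpolation between the two closest ranks.
# fraction is expected to be between 0 and 1, and values must not be empty.
def percentile_inc(values, fraction)
  sorted = values.sort
  rank = fraction * (sorted.length - 1)   # zero-based fractional rank
  k = rank.floor                          # index of the lower neighbour
  f = rank - k                            # distance towards the upper neighbour
  upper = sorted[k + 1] || sorted[k]      # no upper neighbour when fraction == 1
  sorted[k] + f * (upper - sorted[k])
end

# Quartiles are the 25th, 50th and 75th percentiles.
def quartiles(values)
  [0.25, 0.5, 0.75].map { |q| percentile_inc(values, q) }
end

p quartiles([1, 2, 3, 4]) #=> [1.75, 2.5, 3.25]
```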


Percentile based on count of items

a = [1,2,3,4,5,6,10,11,12,13,14,15,20,30,40,50,60,61,91,99,120]

def percentile_by_count(array, percentile)
  count = (array.length * (1.0 - percentile)).floor
  # last(count) returns [] when count is 0; sort[-count..-1] would return the whole array
  array.sort.last(count)
end

# 80th percentile (21 items*80% == 16.8 items are below; pick the top 4)
p percentile_by_count(a,0.8) #=> [61, 91, 99, 120]
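One thing to watch with the count calculation is floating-point rounding: for 10 items and the 90th percentile, 10 * (1.0 - 0.9) evaluates to 0.9999999999999998, so flooring gives 0 instead of 1. A minimal sketch that sidesteps this, assuming you are happy to pass the percentile as a whole-number percent (the name `percentile_by_count_pct` is made up for this example):

```ruby
# Hypothetical variant: percent is an Integer such as 80, not a Float such as 0.8,
# so the count is computed with integer arithmetic only.
def percentile_by_count_pct(array, percent)
  count = array.length * (100 - percent) / 100   # Integer division rounds down
  array.sort.last(count)
end

a = [1,2,3,4,5,6,10,11,12,13,14,15,20,30,40,50,60,61,91,99,120]

p percentile_by_count_pct(a, 80)             #=> [61, 91, 99, 120]
p percentile_by_count_pct((1..10).to_a, 90)  #=> [10]
```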

Percentile based on range of values

def percentile_by_value(array, percentile)
  min, max = array.minmax
  range = max - min
  min_value = range * percentile + min
  array.select { |v| v >= min_value }
end

# 80th percentile (119 * 80% + 1 = 96.2; pick values at or above this)
p percentile_by_value(a,0.8) #=> [99, 120]

Interestingly, Excel's PERCENTILE function returns 60 for the 80th percentile of this data. If you want that value in the result, that is, if an item falling on the cusp of the limit should be included, then change the .floor in percentile_by_count to .ceil.
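The difference between the two rounding choices can be sketched as follows. The `rounding:` keyword and the method name `percentile_by_count_rounded` are assumptions added for illustration and are not part of the method above.

```ruby
# Hypothetical variant with a switchable rounding mode (:floor or :ceil).
def percentile_by_count_rounded(array, percentile, rounding: :floor)
  count = (array.length * (1.0 - percentile)).public_send(rounding)
  array.sort.last(count)
end

a = [1,2,3,4,5,6,10,11,12,13,14,15,20,30,40,50,60,61,91,99,120]

p percentile_by_count_rounded(a, 0.8)                  #=> [61, 91, 99, 120]
p percentile_by_count_rounded(a, 0.8, rounding: :ceil) #=> [60, 61, 91, 99, 120]
```

With .ceil, the fractional 4.2 items are rounded up to 5, so 60 joins the result.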