Scala: What is the generic way to calculate standard deviation?

Travis Brown's answer is well written; however, you can simply make the Numeric[A] an implicit parameter:

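def standardDeviation[A](a: Seq[A])(implicit num: Numeric[A]): Double = {

  // Convert the sum to Double before dividing so integral inputs are not truncated.
  def mean(a: Seq[A]): Double = num.toDouble(a.sum) / a.size

  // Population variance: the mean squared deviation from the average.
  def variance(a: Seq[A]): Double = {
    val avg = mean(a)
    a.map(num.toDouble).map(x => math.pow(x - avg, 2)).sum / a.size
  }

  math.sqrt(variance(a))

}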

The goal of a type class like Numeric is to provide a set of operations for a type so that you can write code that works generically on any type that has an instance of the type class. Numeric provides one set of operations, and its subclasses Integral and Fractional additionally provide more specific ones (but they also characterize fewer types). If you don't need these more specific operations, you can simply work at the level of Numeric, but unfortunately in this case you do.
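To make the split concrete, here is a minimal sketch. The object and method names (NumericOpsSketch, sumOfSquares, half, halfQuot) are made up for illustration. It shows that addition and multiplication are available on every Numeric, while division only appears on the subclasses, and under different names:

```scala
object NumericOpsSketch {
  // Works for every Numeric type: only zero, plus and times are needed.
  def sumOfSquares[T](xs: Iterable[T])(implicit T: Numeric[T]): T =
    xs.foldLeft(T.zero)((acc, x) => T.plus(acc, T.times(x, x)))

  // True division is only available on Fractional...
  def half[T](x: T)(implicit T: Fractional[T]): T = T.div(x, T.fromInt(2))

  // ...and truncating division only on Integral.
  def halfQuot[T](x: T)(implicit T: Integral[T]): T = T.quot(x, T.fromInt(2))
}
```

With these definitions, NumericOpsSketch.sumOfSquares accepts both List(1, 2, 3) and List(1.5, 2.0), but half only accepts fractional values and halfQuot only integral ones.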

Let's start with mean. The problem here is that division means different things for integral and fractional types, and isn't provided at all for types that are only Numeric. The answer you've linked from Daniel gets around this issue by dispatching on the runtime type of the Numeric instance, and just crashing (at runtime) if the instance isn't either a Fractional or Integral.
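For reference, the runtime-dispatch approach described above might look roughly like the following. This is my own reconstruction of the idea, not necessarily the code from the linked answer, and the wrapper object name and error message are hypothetical:

```scala
object RuntimeDispatchSketch {
  // Picks the division operation by inspecting the Numeric instance at runtime.
  def mean[T](xs: Iterable[T])(implicit num: Numeric[T]): T = {
    val n = num.fromInt(xs.size)
    num match {
      case frac: Fractional[T] => frac.div(xs.sum, n)      // true division
      case int: Integral[T]    => int.quot(xs.sum, n)      // truncating division
      case _                   => sys.error("Numeric instance supports no division")
    }
  }
}
```

Note that the compiler accepts any Numeric[T] here; an instance that is neither Fractional nor Integral only fails once mean is actually called.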

I'm going to disagree with Daniel (or at least Daniel five years ago) and say this isn't really a great approach—it's both papering over a real difference and throwing out a lot of type safety at the same time. There are three better solutions in my view.

Only provide these operations for fractional types

You might decide that taking the mean isn't meaningful for integral types, since integral division loses the fractional part of the result, and only provide it for fractional types:

def mean[T: Fractional](xs: Iterable[T]): T = {
  val T = implicitly[Fractional[T]]

  T.div(xs.sum, T.fromInt(xs.size))
}

Or with the nice implicit syntax:

def mean[T: Fractional](xs: Iterable[T]): T = {
  val T = implicitly[Fractional[T]]
  import T._

  xs.sum / T.fromInt(xs.size)
}

One last syntactic point: if I find I have to write implicitly[SomeTypeClass[A]] to get a reference to a type class instance, I tend to desugar the context bound (the [A: SomeTypeClass] part) to clean things up a bit:

def mean[T](xs: Iterable[T])(implicit T: Fractional[T]): T =
  T.div(xs.sum, T.fromInt(xs.size))

This is entirely a matter of taste, though.
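A short usage sketch of the Fractional-only version (the wrapper object and value names are just for illustration) shows what this buys you: fractional inputs work, and integral inputs are rejected at compile time rather than silently truncated.

```scala
object FractionalMeanSketch {
  def mean[T](xs: Iterable[T])(implicit T: Fractional[T]): T =
    T.div(xs.sum, T.fromInt(xs.size))

  val doubles: Double = mean(List(1.0, 2.0, 4.5))                      // 2.5
  val decimals: BigDecimal = mean(List(BigDecimal(1), BigDecimal(2)))  // 1.5

  // mean(List(1, 2, 3)) does not compile: there is no implicit Fractional[Int].
}
```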

Return a concrete fractional type

You could also make mean return a concrete fractional type like Double, and simply convert the Numeric values to that type before performing the operation:

def mean[T](xs: Iterable[T])(implicit T: Numeric[T]): Double =
  T.toDouble(xs.sum) / xs.size

Or, equivalently but with the toDouble syntax for Numeric:

import Numeric.Implicits._

def mean[T: Numeric](xs: Iterable[T]): Double = xs.sum.toDouble / xs.size

This provides correct results for both integral and fractional types (up to the precision of Double), but at the expense of making your operation less generic.
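Here is a small sketch of that trade-off, with hypothetical value names: integral inputs no longer truncate, but a BigDecimal carrying more digits than a Double can represent comes back rounded.

```scala
object DoubleMeanSketch {
  import Numeric.Implicits._

  def mean[T: Numeric](xs: Iterable[T]): Double = xs.sum.toDouble / xs.size

  val ints: Double = mean(List(1, 2))        // 1.5, no integer truncation
  val longs: Double = mean(List(1L, 2L, 3L)) // 2.0

  // Precision caveat: 25 significant digits do not fit in a Double.
  val big: BigDecimal = BigDecimal("0.1234567890123456789012345")
  val roundTrip: BigDecimal = BigDecimal(mean(List(big)))
  val lostPrecision: Boolean = roundTrip != big // true
}
```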

Create a new type class

Lastly, you could create a new type class that provides a shared division operation for Fractional and Integral:

trait Divisible[T] {
  def div(x: T, y: T): T
}

object Divisible {
  implicit def divisibleFromIntegral[T](implicit T: Integral[T]): Divisible[T] =
    new Divisible[T] {
      def div(x: T, y: T): T = T.quot(x, y)
    }

  implicit def divisibleFromFractional[T](implicit T: Fractional[T]): Divisible[T] =
    new Divisible[T] {
      def div(x: T, y: T): T = T.div(x, y)
    }
}

And then:

def mean[T: Numeric: Divisible](xs: Iterable[T]): T =
  implicitly[Divisible[T]].div(xs.sum, implicitly[Numeric[T]].fromInt(xs.size))

This is essentially a more principled version of the original mean—instead of dispatching on subtype at runtime, you're characterizing the subtypes with a new type class. There's more code, but no possibility of runtime errors (unless of course xs is empty, etc., but that's an orthogonal problem that all of these approaches run into).
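Putting the pieces together, a self-contained sketch (wrapped in a hypothetical DivisibleSketch object so the companion's implicits are found) behaves like this. The last two values illustrate the empty-collection caveat: integral types throw, while Double quietly yields NaN.

```scala
object DivisibleSketch {
  trait Divisible[T] { def div(x: T, y: T): T }

  object Divisible {
    implicit def divisibleFromIntegral[T](implicit T: Integral[T]): Divisible[T] =
      new Divisible[T] { def div(x: T, y: T): T = T.quot(x, y) }

    implicit def divisibleFromFractional[T](implicit T: Fractional[T]): Divisible[T] =
      new Divisible[T] { def div(x: T, y: T): T = T.div(x, y) }
  }

  def mean[T: Numeric: Divisible](xs: Iterable[T]): T =
    implicitly[Divisible[T]].div(xs.sum, implicitly[Numeric[T]].fromInt(xs.size))

  val intMean: Int = mean(List(1, 2, 4))        // 2 (7 quot 3)
  val doubleMean: Double = mean(List(1.0, 2.0)) // 1.5

  // Empty input: Int division by zero throws, Double division gives NaN.
  val emptyIntFails: Boolean = scala.util.Try(mean(List.empty[Int])).isFailure
  val emptyDoubleIsNaN: Boolean = mean(List.empty[Double]).isNaN
}
```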

Conclusion

Of these three approaches, I'd probably choose the second, which in your case seems especially appropriate since your variance and stdDev already return Double. In that case the entire thing would look like this:

import Numeric.Implicits._

def mean[T: Numeric](xs: Iterable[T]): Double = xs.sum.toDouble / xs.size

def variance[T: Numeric](xs: Iterable[T]): Double = {
  val avg = mean(xs)

  // Map over an iterator so that a Set argument cannot merge equal squared deviations.
  xs.iterator.map(x => math.pow(x.toDouble - avg, 2)).sum / xs.size
}

def stdDev[T: Numeric](xs: Iterable[T]): Double = math.sqrt(variance(xs))

…and you're done.
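As a quick sanity check, here is a worked example on a small data set. The wrapper object and the sample values are my own choices. The variance here maps over xs.iterator so that passing a Set does not collapse equal squared deviations, and like the code above it divides by n, so this is the population standard deviation rather than the sample one.

```scala
object StdDevExample {
  import Numeric.Implicits._

  def mean[T: Numeric](xs: Iterable[T]): Double = xs.sum.toDouble / xs.size

  def variance[T: Numeric](xs: Iterable[T]): Double = {
    val avg = mean(xs)
    xs.iterator.map(x => math.pow(x.toDouble - avg, 2)).sum / xs.size
  }

  def stdDev[T: Numeric](xs: Iterable[T]): Double = math.sqrt(variance(xs))

  val data = List(2, 4, 4, 4, 5, 5, 7, 9)
  // mean = 40 / 8 = 5.0; the squared deviations sum to 32, so variance = 32 / 8 = 4.0
  val sd: Double = stdDev(data) // 2.0
}
```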

In real code I'd probably look at a library like Spire instead of using the standard library's type classes, though.

Tags: Generics, Scala