Type erasure in Haskell?

Implementations of dynamically typed languages typically need to store type information with each value in memory. This isn't too bad for Lisp-like languages that have just a few types and can reasonably identify them with a few tag bits (although such limited types lead to other efficiency issues). It's much worse for a language with lots of types.

Haskell lets you carry type information to runtime, but it forces you to be explicit about it, so you can pay attention to the cost. For example, adding the constraint Typeable a to a type gives values of that type access, at runtime, to a representation of the type a. More subtly, typeclass instance dictionaries are usually specialized away at compile time, but in sufficiently polymorphic or complex cases they may survive to runtime. In a compiler like the apparently-abandoned JHC (and, likely, the as-yet-barely-started compiler THC), this could lead to some type information leaking to runtime in the form of pointer tagging. But these situations are fairly easy to identify and only rarely cause serious performance problems.
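
For instance, a minimal sketch of what that constraint buys you (describe is just a made-up name for illustration):

import Data.Typeable (Typeable, typeOf)

-- The Typeable constraint is what gives this function access, at
-- runtime, to a representation of its argument's type.
describe :: Typeable a => a -> String
describe x = "a value of type " ++ show (typeOf x)

In GHCi, describe (3 :: Int) gives "a value of type Int" and describe True gives "a value of type Bool"; drop the Typeable constraint and no such inspection is possible.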


What I don't understand is: how are "all Haskell functions" parametric?

It doesn't say that all Haskell functions are parametric; it says:

All Haskell functions must be parametric in their type parameters.

A Haskell function need not have any type parameters.
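
A quick sketch of the distinction (isZero and swap are made-up names):

isZero :: Int -> Bool      -- no type parameters: parametricity says nothing here
isZero n = n == 0

swap :: (a, b) -> (b, a)   -- two type parameters: must treat a and b uniformly
swap (x, y) = (y, x)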

One more question: why does the author say that "Despite how important types are when writing Haskell code, they are completely irrelevant when running Haskell code"?

Unlike in a dynamically typed language, where you need to check at run time whether (for example) two things are numbers before trying to add them together, your running Haskell program knows that if it's adding two things together, they must be numbers, because the compiler made sure of that beforehand.
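
As a small sketch (addOne is a made-up name):

addOne :: Int -> Int
addOne n = n + 1

-- addOne 41        is accepted and needs no runtime type check
-- addOne "hello"   is rejected by the compiler before the program runs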

Aren't types explicit/static in Haskell?

Types in Haskell can often be inferred, in which case they don't need to be explicit. But you're right that they're static, and that is actually why they don't matter at run time, because static means that the compiler makes sure everything has the type that it should before your program ever executes.
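
For example (pairUp is a made-up name):

-- No type annotation is written here; GHC infers
-- pairUp :: a -> (a, a) before the program ever runs.
pairUp x = (x, x)

Asking GHCi with :t pairUp shows the inferred type.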


Types can be erased in Haskell because the type of an expression is either known at compile time (like True) or does not matter at runtime (like []).

There's a caveat to this, though: it assumes that all values have some kind of uniform representation. Most Haskell implementations use pointers for everything, so the actual type of what a pointer points to doesn't matter (except to the garbage collector), but you could imagine a Haskell implementation that uses a non-uniform representation, and then some type information would have to be kept.
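
GHC's unboxed types are an example of where the representation stops being uniform: an Int# is a raw machine word rather than a pointer to a heap object, which is also why it can't instantiate an ordinary type variable. A sketch (addUnboxed is a made-up name):

{-# LANGUAGE MagicHash #-}
import GHC.Exts

-- Int# values are raw machine words, not uniformly represented
-- pointers, so code like this cannot be made polymorphic over them.
addUnboxed :: Int# -> Int# -> Int#
addUnboxed x y = x +# y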


Others have already answered, but perhaps some examples can help.

Python, for instance, retains type information until runtime:

>>> def f(x):
...   if type(x)==type(0):
...     return (x+1,x)
...   else:
...     return (x,x)
... 
>>> f("hello")
('hello', 'hello')
>>> f(10)
(11, 10)

The function above, given any argument x, returns the pair (x,x), except when x is of type int. The function tests for that type at runtime, and if x is found to be an int, it behaves in a special way, returning (x+1, x) instead.

To realize the above, the Python runtime must keep track of types. That is, when we do

>>> x = 5

Python cannot just store the byte representation of 5 in memory. It also needs to mark that representation with the type tag int, so that when we do type(x) the tag can be recovered.

Further, before performing any operation such as x+1, Python needs to check the type tag to ensure we are really working on ints. If x is, for instance, a string, Python will raise an exception.

Statically checked languages such as Java do not need such checks at runtime. For instance, when we run

SomeClass x = new SomeClass(42);
x.foo();

the compiler has already checked, at compile time, that there is indeed a method foo for x, so there's no need to do that again. This can improve performance, in principle. (Actually, the JVM does some runtime checks at class load time, but let's ignore those for the sake of simplicity.)

In spite of the above, Java still has to store type tags, like Python does, since it has an analogue of type(-):

if (x instanceof SomeClass) { ...

Hence, Java allows one to write functions which can behave "specially" on some types.

// this is a "generic" function, using a type parameter A
<A> A foo(A x) {
   if (x instanceof B) {  // B is some arbitrary class
      B b = (B) x;
      return (A) new B(b.get()+1);
   } else {
      return x;
   }
}

The above function foo() just returns its argument, except when it's of type B, in which case a new object is created instead. This is only possible because of instanceof, which requires every object to carry a type tag at runtime.

To be fair, such a tag is already needed to implement virtual methods, so it does not cost anything extra. Still, the presence of instanceof makes the non-uniform behaviour above possible -- some types can be handled differently from others.

Haskell, by contrast, has no such type/instanceof operator. A parametric Haskell function having the type

foo :: a -> (a,a)

must behave in the same way at all types. There's no way to trigger any "special" behaviour. Concretely, foo x must return (x,x), and we can see this just by looking at the type annotation above. To stress the point: there's no need to look at the code (!!) to prove such a property. This is what parametricity guarantees from the type above.
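
If you do want the Python-style special case on Int, you have to ask for runtime type information explicitly, as with the Typeable constraint mentioned in the first answer. A sketch of what that might look like (f mirrors the Python example; eqT and Refl come from Data.Typeable):

{-# LANGUAGE ScopedTypeVariables, TypeApplications #-}
import Data.Typeable

-- The Typeable constraint is now visible in the type, so the
-- "special" behaviour on Int is no longer hidden from the caller.
f :: forall a. Typeable a => a -> (a, a)
f x = case eqT @a @Int of
        Just Refl -> (x + 1, x)   -- the compiler now knows a is Int
        Nothing   -> (x, x)

With the plain type a -> (a, a), no such definition is possible.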
