What cases do the GHC occurs check identify?

Think of type inference as solving a system of equations. Let's look at an example:

f x = (x,2)

How can we deduce the type of f? We see that it is a function:

f :: a -> b

Additionally, from the structure of f we can see that the following equations hold simulatenously:

b = (c,d)
d = Int
c = a

By solving this system of equations we can see that the type of f is a -> (a, Int). Now let's look at the following (erroneous) function:

f x = x : x

The type of (:) is a -> [a] -> [a], so this generates the following system of equations (simplified):

a = a
a = [a]

So we get an equation a = [a], from which we can conclude that this system of equations doesn't have a solution, and therefore the code is not well-typed. If we didn't reject the equation a = [a], we would indeed go in an infinite loop adding equations a = [a], a = [[a]], a = [[[a]]], etc to our system (alternatively, as Daniel notes in his answer, we could allow infinite types in our type system, but that would make erroneous programs such as f x = x : x to typecheck).

You can also test this in ghci:

> let f x = x : x

<interactive>:2:15:
    Occurs check: cannot construct the infinite type: a0 = [a0]
    In the second argument of `(:)', namely `x'
    In the expression: x : x
    In an equation for `f': f x = x : x

As to your other questions: GHC Haskell's type system is not Turing-complete and the typechecker is guaranteed to halt - unless you enable UndecidableInstances, in which case it theoretically can go in an infinite loop. GHC, however, ensures termination by having a fixed-depth recursion stack, so it's not possible to construct a program on which it never halts (edit: according to C.A.McCann's comment, it is possible after all - there's an analogue of tail recursion on the type level that allows to loop without increasing the stack depth).

It is, however, possible to make compilation take arbitrarily long time, since the complexity of Hindley-Milner type inference is exponential in the worst (but not the average!) case:

dup x = (x,x)

bad = dup . dup . dup . dup . dup . dup . dup . dup
      . dup . dup . dup . dup . dup . dup . dup . dup
      . dup . dup . dup . dup . dup . dup . dup . dup
      . dup . dup . dup . dup . dup . dup . dup . dup

Safe Haskell won't protect you from this - take a look at mueval if you want to allow potentially malicious users compile Haskell programs on your system.


The GHC occurs check prevents you from constructing infinite types.

This is true only in the very literal sense of preventing types that are syntactically infinite. What's really going on here is just a recursive type where the unification algorithm would need to, in a sense, inline the recursion.

It's always possible to define exactly the same type by making the recursion explicit. This can even be done generically, using a type-level version of fix:

newtype Fix f = Fix (f (Fix f))

As an example, the type Fix ((->) a) is equivalent to unifying b with (a -> b).

In practice, however, "infinite type" errors almost always indicate an error in the code (so if it's broke, you probably shouldn't Fix it). The usual scenario is mixing up argument order or otherwise having an expression in the wrong place in code that doesn't use explicit type signatures.

A type inference system that was extremely naive in the right way could potentially expand the recursion until it ran out of memory, but the halting problem doesn't enter into it--if a type needs to unify with part of itself, that's never going to work (at least in Haskell, there may be languages that instead treat it as equivalent to the explicitly recursive newtype above).

Neither type checking nor type inference in GHC are Turing-complete unless you enable UndecidableInstances, in which case you can do arbitrary computation via functional dependencies or type families.

Safe Haskell doesn't really enter the picture at all. It's easy to generate very large inferred types that will exhaust system memory despite being finite, and if memory serves me Safe Haskell doesn't restrict use of UndecidableInstances anyway.


I have the following wonderful mail in my bookmarks: There's nothing wrong with infinite types! Even with infinite types, there's no real danger of making the typechecker loop. The type system is not Turing complete (unless you explicitly ask for it to be with something like the UndecidableInstances extension).