What is a principal type?

To build a little bit upon gasche's explanation, here's an example, stolen from OCaml's own testsuite, where principality, or lack thereof, appears. Disclaimer: this uses objects.

class c = object method id : 'a. 'a -> 'a = fun x -> x end;;
type u = c option;;
let just = function None -> failwith "just" | Some x -> x;;
let f x = let l = [Some x; (None : u)] in (just(List.hd l))#id;;
let g x =
  let none = (fun y -> ignore [y;(None:u)]; y) None in
  let x = List.hd [Some x; none] in (just x)#id;;
let h x =
  let none = let y = None in ignore [y;(None:u)]; y in 
  let x = List.hd [Some x; none] in (just x)#id;;

When running this in a top-level session, you'll get:

#     val f : c -> 'a -> 'a = <fun>
#     val g : c -> 'a -> 'a = <fun>
#     val h : < id : 'a; .. > -> 'a = <fun>

These functions perform the exact same thing but get assigned different types!

If you invoke OCaml with the -principal option, the first two examples will trigger:

Warning 18: this use of a polymorphic method is not principal.

The interesting thing is that if you replace 'a in the type of h with 'a -> 'a, you get the exact same type as f and g, since the type c is just a type abbreviation for (i.e. expands to) < id: 'a -> 'a; .. >.

What the compiler wants to tell the programmer here is that there are two suitable types for f and g, and that there is no criterion that would help OCaml decide which one is the "best" (and by "best", I mean "principal", of course!).


The process of type inference is the fact of guessing, given an user-written program, what the type of this program is. In general, there may be several correct types for a given program. For example, the program fun x -> x can be given the types int -> int and bool -> bool.

Given a program, a type for this program is principal if it is the most general type that can be given to this program, in the sense that all other possible types are specialization (instances) of this type. With my fun x -> x example, the polymorphic 'a -> 'a is a principal type.

In some type systems, principal types do not always exist. You have a program P with two possible types T1 and T2, none of them being more general than the other. For example, in some systems where numeric operators are overloaded, the program fun x -> x + x can be given the type int -> int and the type float -> float, and there is no type that subsumes both. This is a problem for the inference engine, because it means that it has to make an arbitrary choice, pick one of the possible type without knowing if it's the one the user intended. If you have principal types, the inference process does not need to make any choice.

(To solve that example you could: (1) not overload arithmetic operators (2) make an arbitrary choice (that's what F# does iirc) (3) reject the program and ask for a type annotation to remove the ambiguity (4) have more expressive types such as Haskell's Num a => a -> a.)

The "simple" subset of the OCaml language, based on Hindley-Milner type inference, has principal types. This means that the inference engine always does the right thing (given the specification of the possible types). Some more advanced aspects of the type system (eg. polymorphic fields and methods) lose this property: in some cases the type-system can't find a most general type, or finding the most general type would require sensibly more complex computations from the type inference engine (which generally tries to be fast). The -principal option is a knob that will, if I remember correctly:

  • fail is some case where the type-checker would otherwise have accepted a non-principal solution (made an arbitrary choice)
  • try harder to find a principal solution, at the cost of longer type-checking time and memory usage

I'm not very familiar with this flag (I prefer to avoid too advanced type-system features so my program are usually not concerned), so you would have to double-check this, but that is the rough idea. This flag is relatively unimportant in my opinion (you don't usually need to care), but the idea of principal types is indeed an important part of the theory of ML languages.

Two more technical details if you wish to go further:

  • The ML notion of "principal types" is the question of whether there exist a most general type given a fixed type environment; some authors study the question of whether they exist a most general (environment, type) pair, so-called a "principal typing"; it is a harder question (you have to infer what you expect from the other modules, while in ML you are given signatures for the external modules you rely on; and inferring polymorphism is very hard) that is not used in most ML-inspired programming languages.

  • Existence of principal types is a delicate equilibrium for designers of type system. If you remove features from the ML type system (polymorphism, types such as 'a -> 'a), you lose principality, but if you add power (going from ML to System F which has more expressive polymorphic types) you may also lose principality. You can regain that lost principality by moving to even more complex type systems such as MLF, but it is a hard problem.

In practice, a relatively large part of the designers of programming languages have lately given up on the idea of principality. They want to have more ambitious type systems (dependent types, etc.) where it's just too hard to seek principality, so they instead content themselves with non-principal inference: it's already good if the inference engine can find some type, let's not be difficult on the generality of the result. Jacques Garrigue, the main maintainer of the OCaml type system, still cares about it very much, and I think that's an interesting aspect of the OCaml programming language research.

Tags:

Ocaml