Assigning literals to terms in GHC

Because these literals are patterns, so you are using pattern bindings.

let bindings in Haskell are lazy, so no actual pattern matching is performed.

But if we force the matching, it indeed fails:

> :{
  let
    x@1 = 0
  in
    1
  :}
1                    -- no assignment!     NB
it :: Num a => a

> :{
  let
    x@1 = 0
  in
    x
  :}
*** Exception: <interactive>:87:1-7: Irrefutable pattern failed for pattern x@1

because 1 indeed is not 0.

So this it not a bug nor a feature of GHC's implementation, but rather a feature of the language Haskell itself.

It is allowed because it is a natural consequence of the language rules, and not problematic enough to make a special case in the language specification to prevent it.

Natural consequence
There are two standard kinds of definitions: function definitions and data definitions. In a function definition, you are allowed to write patterns as arguments to the function on the left of the equality sign. In data definitions, you are allowed to write a pattern by itself to the left of the equality sign to match against the data to the right of the equality sign. So, these kinds of things are all allowed:

x = 3
x@y = 3
[x, y, z] = [3, 4, 5]
[x, _, z] = [3, 4, 5]
[x, 4, z] = [3, 4, 5]
x:_ = "xsuffix"
x:"suffix" = "xsuffix"

Number literals and string literals are patterns. (The former desugars into a guard that calls (==).) So these are allowed, too:

3 = 3
x@3 = 3
[3, 4, 5] = [3, 4, 5]
"xsuffix" = "xsuffix"
-- and yes, these too
3 = 4
"xsuffix" = "lolno"

Not problematic
As with all other parts of the language, data definitions are lazy: the pattern match calculation they represent is not performed until it's demanded (by inspecting one of the variables bound by the match). Since "hello" = "world" and 1 = 0 do not bind any variables, the pattern match calculation they represent -- which would throw an exception -- is never performed. So it's not super important to avoid allowing them.

...except when it is
But wait... we said this was a valid pattern:

x@3 = 3

And this similar one diverges and binds a variable:

x@3 = 4

How come that one is allowed? That's much harder to answer! I think it would be quite reasonable to try to think of some language rules that would prevent this. In general, a rule that is sound and complete would of course be undecidable, since the right hand side of the equation can do arbitrary computation. But there are other choices you could make; for example:

Do not allow refutable patterns in data definitions. A pattern is refutable if it could fail to match. For example, x is irrefutable, x@y is irrefutable, _ is irrefutable, but x:y is refutable, True is refutable, () is refutable (because it diverges when the RHS is bottom). This is by far the simplest, and would rule out x@3 = 4 and "hello" = "world" both. Unfortunately, it would also rule out very useful things like [x, y, z] = [3, 4, 5].
Implement a termination checker, and require refutable patterns' RHS to terminate. If you had an analysis that could decide that some calculations terminate -- for example, by discovering that all recursion in it were structural or something, there's a whole cottage industry of termination checking algorithms -- then you could have the compiler check that. If it does terminate, the compiler can actually run the computation during compilation, and double-check that the pattern given actually will match the value. The downside of this is that termination-checking algorithms are very complicated, so this puts a big burden on the compiler writer, and some are difficult for humans to predict, which makes programming against it frustrating for the user.
Demand that the programmer prove the match can't fail. You could introduce a mechanism for the programmer to write proofs about their programs, and demand that they prove the match won't fail. This moves in the direction of dependent types; the two main costs of such a move are typically a reduction in program efficiency and that writing programs in such languages requires much more effort.

The language designers made a number of choices (not just in the pattern-matching semantics) that err on the side of making programmers' and compiler-writers' lives a bit easier, but allow a few more programs that throw exceptions or never finish. This is one such spot -- refutable patterns are allowed in data definitions, even if that can cause a crash, because the implementation of that policy is useful, simple, and predictable.

Assigning literals to terms in GHC

Tags:

Literals

Binding

Haskell

Variable Assignment

Ghc

Related

Recent Posts