# Why does a wildcard match work when enumerating all cases doesn't?

In typing a case construct (whether an explicit `case` statement or an implicit pattern-based function definition) in the **absence** of GADTs, the individual alternatives:

```
pattern -> body
```

can be unified by typing all the patterns and unifying those with the type of the scrutinee, then typing all the bodies and unifying those with the type of the `case` expression as a whole. So, in a simple example like:

```
data U = UA | UB | UC

isA1 u = case u of
  UA -> True
  UB -> False
  x -> False
```

we can initially type the patterns `UA :: U`, `UB :: U`, `x :: a`, unify them using the type equality `a ~ U` to infer the type of the scrutinee `u :: U`, and similarly unify `True :: Bool` and both of the `False :: Bool` to the type of the overall case expression `Bool`, unifying that with the type of `isA1` to get `isA1 :: U -> Bool`.

Note that the process of unification can specialize the types. Here, the type of the pattern `x :: a` was general, but by the end of the unification process, it had been specialized to `x :: U`. This can happen with the bodies, too. For example:

```
len mstr = case mstr of
  Nothing -> 0
  Just str -> length str
```

Here, `0 :: Num a => a` is polymorphic, but because `length` returns an `Int`, by the end of the process, the bodies (and so the entire case expression) have been unified to the type `Int`.

In general, through unification, the common, unified type of all the bodies (and so the type of the overall case expression) will be the "most general" / "least restrictive" type of which the types of the bodies are all generalizations. In some cases, this type might *be* the type of one of the bodies, but in general, all the bodies can be more general than the "most general" unified type, but no body can be more restrictive.
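As a quick check of the two examples above, here is a minimal sketch that writes out the types unification arrives at as explicit signatures. The signatures are not needed for the code to compile; they are only there to make the inferred result visible.

```haskell
data U = UA | UB | UC

-- The pattern `x :: a` is specialized to `x :: U` by unifying it
-- with the constructor patterns, so the scrutinee has type U.
isA1 :: U -> Bool
isA1 u = case u of
  UA -> True
  UB -> False
  x  -> False

-- `0 :: Num a => a` is specialized to Int, because the other body,
-- `length str`, has type Int and all bodies must share one type.
len :: Maybe String -> Int
len mstr = case mstr of
  Nothing  -> 0
  Just str -> length str
```

Replacing either signature with a more general one (say, `len :: Num n => Maybe String -> n`) should be rejected, since no body may be more restrictive than the unified type.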

Things change in the presence of GADTs. When type-checking case constructs with GADTs, the patterns in an alternative can introduce a "type refinement", a set of additional bindings of type variables to be used in type-checking the bodies of the alternative. (This is what makes GADTs useful in the first place.)

Because different alternatives' bodies are typed under different refinements, naive unification isn't possible. For example, consider the tiny typed DSL and its interpreter:

```
data Term a where
  Lit :: Int -> Term Int
  IsZ :: Term Int -> Term Bool
  If :: Term Bool -> Term a -> Term a -> Term a

eval :: Term a -> a
eval t = case t of
  Lit n -> n
  IsZ t -> eval t == 0
  If b t e -> if eval b then eval t else eval e
```

If we were to naively unify the bodies `n :: Int`, `eval t == 0 :: Bool`, and `if eval b then eval t else eval e :: a`, the program wouldn't type check (most obviously, because `Int` and `Bool` don't unify!).

In general, because type refinements allow the calculated types of the alternatives' bodies to be *more* specific than the final type, there's no obvious "most general" / "least restrictive" type to which all bodies can be unified, as there was for the case expression without GADTs.

Instead, we generally need to make available a "target" type for the overall case expression (e.g., for `eval`, the return type `a` in the type signature), and then determine if under each refinement introduced by a constructor (e.g., `IsZ` introducing the refinement `a ~ Bool`), the body `eval t == 0 :: Bool` has as its type the associated *refinement* of `a`.
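To make the per-alternative checking concrete, here is a self-contained sketch of the same interpreter with the refinement for each alternative noted in a comment, followed by two sample evaluations. The names `example1` and `example2` are made up for illustration, and the bound variables are renamed slightly to avoid shadowing.

```haskell
{-# LANGUAGE GADTs #-}

data Term a where
  Lit :: Int -> Term Int
  IsZ :: Term Int -> Term Bool
  If  :: Term Bool -> Term a -> Term a -> Term a

-- The signature supplies the target type `a` for the case expression.
eval :: Term a -> a
eval t = case t of
  Lit n    -> n              -- refinement a ~ Int:  n :: Int matches a
  IsZ t'   -> eval t' == 0   -- refinement a ~ Bool: (==) result matches a
  If b x y -> if eval b then eval x else eval y  -- no refinement: body :: a

-- Hypothetical sample terms, one at each result type.
example1 :: Int
example1 = eval (If (IsZ (Lit 0)) (Lit 1) (Lit 2))

example2 :: Bool
example2 = eval (IsZ (Lit 3))
```

Here `example1` evaluates to `1` (the condition `0 == 0` holds) and `example2` to `False`.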

If no target type is explicitly provided, then the best we can do, in general, is use a fresh type variable `p` as the target and try to check each refined type against that.

This means that, given the following definition without a type signature for `isA2`:

```
data P t where
  PA :: P Int
  PB :: P Double
  PC :: P Char

isA2 = \p -> case p of
  PA -> True
  PB -> False
  PC -> False
```

what GHC tries to do is type `isA2 :: P t -> p`. For the alternative:

```
PA -> True
```

it types `PA :: P t` giving the refinement `t ~ Int`, and under this refinement, it tries to type `True :: p`. Unfortunately, `p` is not `Bool` under any refinement involving the unrelated type variable `t`, and we get an error. Similar errors are generated for the other alternatives, too.
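For comparison, here is a sketch of the same definition with an explicit signature added, which is the usual remedy. The signature is my addition and is not part of the definition above. With it, the target type for the case expression is `Bool` rather than a fresh variable `p`, so each refined alternative has something concrete to be checked against.

```haskell
{-# LANGUAGE GADTs #-}

data P t where
  PA :: P Int
  PB :: P Double
  PC :: P Char

-- The signature provides the target type Bool for the case expression.
-- Deleting it brings back the error described above, because the
-- target falls back to a fresh type variable.
isA2 :: P t -> Bool
isA2 = \p -> case p of
  PA -> True   -- under t ~ Int,    True  :: Bool matches the target
  PB -> False  -- under t ~ Double, False :: Bool matches the target
  PC -> False  -- under t ~ Char,   False :: Bool matches the target
```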

Actually, there's one more thing we can do. If there are alternatives that don't introduce a type refinement, then the calculated types of their bodies are **NOT** more specific than the final type. So, if we unify the body types for "unrefined" alternatives, the resulting type provides a legitimate unification target for the refined alternatives.

This means that, for the example:

```
isA3 = \p -> case p of
  PA -> True
  x -> False
```

the second alternative:

```
x -> False
```

is typed by matching the pattern `x :: P t` which introduces no type refinement. The unrefined type of the body is `Bool`, and this type is an appropriate target for unification of the other alternatives.

Specifically, the first alternative:

```
PA -> True
```

matches with a type refinement `t ~ Int`. Under this refinement, the actual type of the body `True :: Bool` matches the "refinement" of the target type `Bool` (which is "refined" to `Bool`), and the alternative is determined to have valid type.
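Here is a self-contained sketch of this case with no signature on `isA3`. The list `checks` is a made-up name for illustration. It applies `isA3` at three different index types, which only type checks if the inferred type is the polymorphic `P t -> Bool`.

```haskell
{-# LANGUAGE GADTs #-}

data P t where
  PA :: P Int
  PB :: P Double
  PC :: P Char

-- No signature: the unrefined alternative `x -> False` supplies Bool
-- as the target type, and `PA -> True` is then checked against it.
isA3 = \p -> case p of
  PA -> True
  x  -> False

-- Hypothetical usage at P Int, P Double, and P Char.
checks :: [Bool]
checks = [isA3 PA, isA3 PB, isA3 PC]
```

Evaluating `checks` gives `[True,False,False]`.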

So, the intuition is that, without a wildcard alternative, the inferred type for the case expression is an arbitrary type variable `p`, which is too general to be unified with the type-refining alternatives. However, when you add a wildcard case alternative `_ -> False`, it introduces a more restrictive body type `Bool` into the unification process that, having been deduced without any type refinement by the pattern `_`, can inform the unification algorithm by providing a more restrictive type `Bool` to which the other, type-refined alternatives can be unified.

Above, I've made it sound like there's some two-phase approach, where the "non-refining" alternatives are examined first to determine a target type, and then the refining alternatives are checked against it.

In fact, what happens is that the refinement process introduces fresh variables into the unification process that, even when they're unified, don't affect the larger type context. So, all the alternatives are unified at once, but unification of the unrefined alternatives affects the larger type context, while unification of the refined alternatives affects only a set of fresh variables. The end result is the same as if the unrefined and refined alternatives were processed separately.