Unexpected caching behavior using polymorphic records in Haskell

Hold my beer.

{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE ConstraintKinds #-}
import Debug.Trace

data Dict c where Dict :: c => Dict c

-- An isomorphism between explicit dictionary-passing style (Dict c -> a)
-- and typeclass constraints (c => a) exists:
from :: (c => a) -> (Dict c -> a)
from v Dict = v

to :: (Dict c -> a) -> (c => a)
to f = f Dict

data Translation = Trans { m :: forall a . Floating a => a }

f1, f2 :: Dict (Floating a) -> a -> a
f1 = trace "hello" $ \Dict x -> x - 2.0
f2 = \Dict -> trace "hello" $ \x -> x - 2.0

main = do
    let trans1 = Trans { m = to (flip f1 1.5) }
        trans2 = Trans { m = to (flip f2 1.5) }
    putStrLn "trans1"
    print (m trans1)
    print (m trans1)
    putStrLn "trans2"
    print (m trans2)
    print (m trans2)

Take a second to predict what this will output before you run it. Then go ask your GHC if she agrees with your guess.

Clear as mud?

The basic distinction you need to draw is illustrated by this significantly simplified example:

> g = trace "a" $ \() -> trace "b" ()
> g ()
a
b
()
> g ()
b
()

There is a separate notion of caching a function and caching its output. The latter is simply never done by GHC (though see the discussion below of what's going on with your optimized version). The former may sound dumb, but it is not as dumb as you might think: you could imagine writing a function which is, say, id if the Collatz conjecture is true and something else otherwise. In such a situation, it makes complete sense to test the Collatz conjecture only once, and then cache forever afterwards whether we should behave as id or not.
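Here is a minimal sketch of that situation (my own illustration; expensiveCheck is a hypothetical stand-in for actually testing the conjecture). Try it compiled without optimizations, or in GHCi:

import Debug.Trace

-- Hypothetical stand-in for a one-time, expensive decision.
expensiveCheck :: Bool
expensiveCheck = trace "deciding" True

g :: Int -> Int
g = if expensiveCheck then id else negate

Applying g twice prints "deciding" only once: the body of g is evaluated the first time g is demanded, and the resulting function itself is what gets cached and reused.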

Once you understand this basic fact, the next leap you must believe is that in GHC, typeclass constraints are compiled to functions. (The arguments to the function are typeclass dictionaries telling how each of the typeclass' methods behave.) GHC itself manages constructing and passing these dictionaries around for you, and in most cases it's quite transparent to the user.
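To make the strategy concrete, here is a hand-rolled approximation of the translation; the NumDict type and its field name are illustrative, not the names GHC actually generates:

-- A class like Num becomes a record of its methods.
data NumDict a = NumDict { timesD :: a -> a -> a }

-- A constrained function like
--     square :: Num a => a -> a
--     square x = x * x
-- becomes a function taking that record explicitly:
square :: NumDict a -> a -> a
square d x = timesD d x x

-- At each use site, GHC picks and passes the appropriate dictionary:
numDictInt :: NumDict Int
numDictInt = NumDict { timesD = (*) }
-- square numDictInt 3 evaluates to 9

This is exactly the dictionary-passing style that the Dict type in your question makes explicit; from and to just move between the two representations.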

But the upshot of this compilation strategy is this: a polymorphic but typeclass-constrained type is a function even if it doesn't appear to have function arrows in it. That is,

f 1.5 :: Floating a => a

looks like a plain old value; but in fact it is a function which takes a Floating a dictionary and produces a value of type a. So any computations that go into producing that value are redone afresh each time this function is applied (read: each time it is used at a specific monomorphic type) because, after all, the precise value chosen depends critically on how the typeclass' methods behave.
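You can watch this happen in GHCi (unoptimized); h is my own minimal example of such a constrained value:

> h = trace "computing" (1.5 - 2.0) :: Floating a => a
> h :: Double
computing
-0.5
> h :: Double
computing
-0.5

Contrast this with g above: there the trace fired only once, because the cached thing was a function; here the whole computation reruns on every use at Double, because each use applies h to a dictionary afresh.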

This leaves only the question of why optimizations changed things in your situation. There I believe what happened is called "specialization", in which the compiler notices that a polymorphic thing is used at a statically-known monomorphic type and makes a dedicated binding for that use. It goes something like this:

-- starting point
main = do
    let trans = \dict -> trace "hello" $ minus dict (fromRational dict (3%2)) (fromRational dict (2%1))
    print (trans dictForDouble)
    print (trans dictForDouble)

-- specialization
main = do
    let trans = \dict -> trace "hello" $ minus dict (fromRational dict (3%2)) (fromRational dict (2%1))
    let transForDouble = trans dictForDouble
    print transForDouble
    print transForDouble

-- inlining
main = do
    let transForDouble = trace "hello" $ minus dictForDouble (fromRational dictForDouble (3%2)) (fromRational dictForDouble (2%1))
    print transForDouble
    print transForDouble

In this last one the function-ness is gone: it is "as if" GHC has cached the output of trans when applied to the dictionary dictForDouble. (If you compile with optimizations and -ddump-simpl, you will see it goes even further, doing constant propagation to turn the minus ... stuff into just D# -0.5##. Whew!)
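If you want to see that Core yourself, compiling with something like the following should show it (the file name is a placeholder; -dsuppress-all just strips noise from the dump so the D# -0.5## is easier to spot):

ghc -O -ddump-simpl -dsuppress-all YourFile.hs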