Reading GHC Core

A tip: if you don't care about type annotations and coercions, use -ddump-simpl together with the -dsuppress-all option. The Core output will be much more readable.
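To try the tip above, here is a minimal sketch of a module worth dumping. The function name sumTo and the file name Main.hs are only illustrative assumptions; the flags in the comment are the ones mentioned in the tip, with -O2 added so the simplifier has something to do.

```haskell
-- Compile with (file name is an assumption):
--   ghc -O2 -ddump-simpl -dsuppress-all Main.hs
-- The Core for sumTo is printed to stdout during compilation.
module Main (main) where

-- Sum the integers from 1 to n with an accumulating local loop.
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go acc i
      | i > n     = acc
      | otherwise = go (acc + i) (i + 1)

main :: IO ()
main = print (sumTo 100)  -- prints 5050
```

In the dumped Core, look for the worker that GHC derives from the local loop; with optimisation on, it usually operates on unboxed integers.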


GHC Core is the System FC language into which all Haskell is translated. The (approximate) grammar for Core is given by:

[Figure: approximate grammar of GHC Core]
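As a rough companion to the grammar, the shape of Core expressions can be sketched as a small Haskell data type. This is a toy model and not GHC's actual definition: names are plain strings, types and coercions are left out entirely, and the constructor and function names are my own assumptions for illustration.

```haskell
module Main (main) where

-- Toy model of Core expressions: untyped, with String names.
data Expr
  = Var String                    -- variable occurrence
  | Lit Integer                   -- literal
  | App Expr Expr                 -- application
  | Lam String Expr               -- lambda abstraction
  | Let String Expr Expr          -- non-recursive let only
  | Case Expr [(Pattern, Expr)]   -- case with a list of alternatives
  deriving Show

data Pattern
  = ConPat String [String]        -- constructor applied to variable binders
  | LitPat Integer                -- literal pattern
  | Default                       -- wildcard alternative
  deriving Show

-- Count the syntax nodes in an expression.
size :: Expr -> Int
size (Var _)       = 1
size (Lit _)       = 1
size (App f x)     = 1 + size f + size x
size (Lam _ b)     = 1 + size b
size (Let _ r b)   = 1 + size r + size b
size (Case s alts) = 1 + size s + sum [size rhs | (_, rhs) <- alts]

-- \x -> case x of { Just y -> y; _ -> 0 }
example :: Expr
example =
  Lam "x" (Case (Var "x") [(ConPat "Just" ["y"], Var "y"), (Default, Lit 0)])

main :: IO ()
main = print (size example)  -- prints 5
```

The real type additionally carries type abstraction and application, casts, and recursive binding groups, which is where most of the noise in unsuppressed Core output comes from.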

Core is closely related to the simpler and better-known System F. Every transformation GHC performs at the Core level is a type-preserving rewrite of this representation, intended to improve performance. Less well known is that you can write directly in Core to program GHC.

GHC Core fits into the compiler pipeline as follows (as it was in 2002, without LLVM and Cmm):

[Figure: the GHC compiler pipeline as of 2002]

The primary documents to learn about GHC Core are:

  • An External Representation for the GHC Core Language, Tolmach, 2001
  • GHC.Core.Expr, the definition in GHC's own source (the Expr type in module GHC.Core)
  • Secrets of the Glasgow Haskell Compiler inliner, Peyton Jones and Marlow, 1999. Core is described in Section 2.3, including details on the occurrence analysis annotations.
  • A transformation-based optimiser for Haskell, Peyton Jones and Santos, 1998. Core is described in Section 3, including a discussion of polymorphism and operational readings of Core.

Related material that can aid understanding:

  • The GHC -fext-core output
  • The GHC source itself, which is how I spent a lot of time learning Core. Some of this is described in my undergraduate thesis from 2002, starting on page 16.
  • The ghc-core tool, which generates Core in a format I find pleasing.

Core in turn is translated into STG code, which looks something like:

[Figure: example of STG code]

The funny names in Core are encoded in the "Z-encoding":

[Figure: table of Z-encoding rules]
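To make the encoding concrete, here is a partial sketch of a Z-encoder. The character table is my recollection of the rules in GHC's encoding module, so treat it as an assumption and check it against the GHC source. It handles single characters only; the special cases for tuples and for non-ASCII characters are deliberately left out, and the names encodeChar and zEncode are my own.

```haskell
module Main (main) where

-- Partial Z-encoding table. Symbols not listed here pass through unchanged
-- in this sketch, whereas GHC escapes them numerically.
encodeChar :: Char -> String
encodeChar c = case c of
  'z'  -> "zz"
  'Z'  -> "ZZ"
  '('  -> "ZL"
  ')'  -> "ZR"
  '['  -> "ZM"
  ']'  -> "ZN"
  ':'  -> "ZC"
  '&'  -> "za"
  '|'  -> "zb"
  '^'  -> "zc"
  '$'  -> "zd"
  '='  -> "ze"
  '>'  -> "zg"
  '#'  -> "zh"
  '.'  -> "zi"
  '<'  -> "zl"
  '-'  -> "zm"
  '!'  -> "zn"
  '+'  -> "zp"
  '\'' -> "zq"
  '\\' -> "zr"
  '/'  -> "zs"
  '*'  -> "zt"
  '_'  -> "zu"
  '%'  -> "zv"
  _    -> [c]   -- letters and digits are left as they are

-- Encode a whole name character by character.
zEncode :: String -> String
zEncode = concatMap encodeChar

main :: IO ()
main = mapM_ (putStrLn . zEncode) ["GHC.Base", ">>=", "$wgo", ":+"]
```

Reading the table in reverse is what you do in practice: a symbol such as GHCziBase in a dump is GHC.Base, and a zd prefix marks a name starting with $, as in compiler-generated workers.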

GHC Core's types and kinds (from Tolmach's paper):

[Figure: GHC Core's types and kinds, from Tolmach's paper]

Finally, GHC's primops appear regularly in GHC Core output, when you have optimized your Haskell down to the basic instructions GHC knows about. The primop set is given as a set of Core functions in a pre-processed file.
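Primops can also be written by hand, which helps when learning to recognise them in a dump. The sketch below assumes the MagicHash extension and the GHC.Exts module; the function name addMul is only illustrative.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (Int (I#), (+#), (*#))

-- Unbox both arguments, combine them with the primitive Int# operations,
-- and box the result again with the I# constructor.
addMul :: Int -> Int -> Int
addMul (I# x) (I# y) = I# ((x +# y) *# y)

main :: IO ()
main = print (addMul 3 4)  -- prints 28
```

Optimised Core for ordinary arithmetic on Int tends to end up in this same form, with I# pattern matches around calls to +# and *#.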