Learning to write a compiler

Big List of Resources:

  • A Nanopass Framework for Compiler Education ¶
  • Advanced Compiler Design and Implementation $
  • An Incremental Approach to Compiler Construction ¶
  • ANTLR 3.x Video Tutorial
  • Basics of Compiler Design
  • Building a Parrot Compiler
  • Compiler Basics
  • Compiler Construction $
  • Compiler Design and Construction $
  • Crafting a Compiler with C $
  • Crafting Interpreters
  • Compiler Design in C ¶
  • Compilers: Principles, Techniques, and Tools $ — aka "The Dragon Book"; widely considered "the book" for compiler writing.
  • Engineering a Compiler $
  • Essentials of Programming Languages
  • Flipcode Article Archive (look for "Implementing A Scripting Engine by Jan Niestadt")
  • Game Scripting Mastery $
  • How to build a virtual machine from scratch in C# ¶
  • Implementing Functional Languages
  • Implementing Programming Languages (with BNFC)
  • Implementing Programming Languages using C# 4.0
  • Interpreter pattern (described in Design Patterns $) specifies a way to evaluate sentences in a language
  • Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages $
  • Let's Build a Compiler by Jack Crenshaw — The PDF ¶ version (examples are in Pascal, but the information is generally applicable)
  • Linkers and Loaders $ (Google Books)
  • Lisp in Small Pieces (LiSP) $
  • LLVM Tutorial
  • Modern Compiler Implementation in ML $ (there are Java $ and C $ versions as well); widely considered a very good book
  • Object-Oriented Compiler Construction $
  • Parsing Techniques - A Practical Guide
  • Project Oberon ¶ - Look at chapter 13
  • Programming a Personal Computer $
  • Programming Languages: Application and Interpretation
  • Rabbit: A Compiler for Scheme ¶
  • Reflections on Trusting Trust — A quick guide
  • Roll Your Own Compiler for the .NET framework — A quick tutorial from MSDN
  • Structure and Interpretation of Computer Programs
  • Types and Programming Languages
  • Want to Write a Compiler? - a quick guide
  • Writing a Compiler in Ruby Bottom Up
  • Compiling a Lisp — compile directly to x86-64

Legend:

  • ¶ Link to a PDF file
  • $ Link to a printed book

This is a pretty vague question, I think, just because of the depth of the topic involved. A compiler can be decomposed into two separate parts, however: a top half and a bottom half. The top half generally takes the source language and converts it into an intermediate representation, and the bottom half takes care of the platform-specific code generation.

Nonetheless, one easy way to approach this topic (the one we used in my compilers class, at least) is to build the compiler in the two pieces described above. Specifically, you'll get a good idea of the entire process by just building the top half.
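To make the top half a bit more concrete, here is a minimal sketch in Python, assuming a toy source language of integer arithmetic. The names (`tokenize`, `Parser`, `gen_ir`) and the stack-machine style of the intermediate representation are my own choices for illustration, not something taken from a particular book or tool. It lexes the text, parses it with a hand-written recursive-descent parser, and emits a flat list of instructions:

```python
import re

def tokenize(src):
    """Lexical analyzer: turn source text into (kind, value) tokens."""
    tokens = []
    for text in re.findall(r"\d+|\S", src):
        if text.isdecimal():
            tokens.append(("num", int(text)))
        elif text in "+-*/()":
            tokens.append(("op", text))
        else:
            raise SyntaxError(f"unexpected character {text!r}")
    tokens.append(("eof", None))
    return tokens

class Parser:
    """Recursive-descent parser for this grammar:
         expr   := term (('+' | '-') term)*
         term   := factor (('*' | '/') factor)*
         factor := NUM | '(' expr ')'
    """
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expr(self):
        node = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            op = self.advance()[1]
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            op = self.advance()[1]
            node = (op, node, self.factor())
        return node

    def factor(self):
        kind, value = self.advance()
        if kind == "num":
            return ("num", value)
        if (kind, value) == ("op", "("):
            node = self.expr()
            if self.advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

def parse(src):
    """Parse a whole expression and reject leftover input."""
    parser = Parser(tokenize(src))
    tree = parser.expr()
    if parser.peek() != ("eof", None):
        raise SyntaxError("trailing input")
    return tree

OPCODES = {"+": "add", "-": "sub", "*": "mul", "/": "div"}

def gen_ir(node, code=None):
    """Post-order walk of the tree, emitting stack-machine instructions."""
    code = [] if code is None else code
    if node[0] == "num":
        code.append(("push", node[1]))
    else:
        op, left, right = node
        gen_ir(left, code)
        gen_ir(right, code)
        code.append((OPCODES[op],))
    return code

print(gen_ir(parse("1 + 2 * 3")))
```

The grammar is layered so that `*` and `/` bind tighter than `+` and `-`. In a class project you would likely generate the lexer and parser with tools rather than write them by hand, but the three stages stay the same.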

Just doing the top half gives you the experience of writing the lexical analyzer and the parser, and then of generating some "code" (that intermediate representation I mentioned). So it will take your source program, convert it to another representation, and do some optimization (if you want), which is the heart of a compiler. The bottom half will then take that intermediate representation and generate the bytes needed to run the program on a specific architecture. For example, the bottom half will take your intermediate representation and generate a PE executable.
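As a rough illustration of the bottom half, here is a small Python sketch that takes a stack-style intermediate representation and prints x86-64 assembly text in Intel syntax. The IR format (tuples like `("push", 1)` and `("add",)`) and the function name `emit_x86_64` are assumptions made up for this example. It stops well short of a real backend: there is no register allocation, no division, and no assembling or linking into a PE file, so treat it as a picture of the idea rather than a working code generator.

```python
# Map hypothetical IR opcodes to x86-64 mnemonics (Intel syntax).
BINARY = {"add": "add", "sub": "sub", "mul": "imul"}

def emit_x86_64(ir):
    """Translate stack-machine IR into x86-64 assembly text.

    Operands live on the hardware stack; the final result ends up in rax.
    """
    lines = []
    for instr in ir:
        op = instr[0]
        if op == "push":
            lines.append(f"    push {instr[1]}")
        elif op in BINARY:
            lines.append("    pop rbx")   # right operand
            lines.append("    pop rax")   # left operand
            lines.append(f"    {BINARY[op]} rax, rbx")
            lines.append("    push rax")  # result goes back on the stack
        else:
            raise ValueError(f"unsupported IR instruction: {op!r}")
    lines.append("    pop rax")           # leave the final value in rax
    return "\n".join(lines)

# IR for the expression 1 + 2 * 3
ir = [("push", 1), ("push", 2), ("push", 3), ("mul",), ("add",)]
print(emit_x86_64(ir))
```

The point of the split is that only this function knows anything about the target machine. Swapping in a different emitter for another architecture would leave the lexer, parser, and IR untouched.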

One book on this topic that I found particularly helpful was Compilers: Principles, Techniques, and Tools (or the Dragon Book, due to the cute dragon on the cover). It's got some great theory and definitely covers context-free grammars in a really accessible manner. Also, for building the lexical analyzer and parser, you'll probably use the *nix tools lex and yacc. And unsurprisingly enough, the book called "lex and yacc" picked up where the Dragon Book left off for this part.