Creating a small programming language for beginners

You might consider learning to build a compiler from a fabulous 1964 (yes, you read that right) paper, "META II: A Syntax-Oriented Compiler Writing Language", which shows how to build "meta compilers".

This paper contains, in 10 pages, a compiler writing philosophy, a definition of a virtual compiler instruction set that is easy to implement, a compiler-compiler, and an example compiler built using the compiler-compiler.

I first learned how to build compilers from this paper, back in 1970 or so. It is astonishing how clever and conceptually simple it is.

If there were one paper I'd make every computer science student read, this would be it.

You can get the paper, see a tutorial, and try an implementation of META II in JavaScript here. The guy behind the tutorial is Dr. James Neighbors, source of the term "domain analysis".


My take on this is that the simpler approach is to skip the hardcore tools intended to automate the creation of lexers and parsers, and instead to write a simple interpreter for a reasonably simple syntax, one whose parser can be written by hand as a simple state machine.
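To give a feel for how small such a hand-written parser can be, here is a minimal sketch in Python of a three-state machine that splits one source line into words. The function name `split_words`, the `State` enum, and the choice of double quotes for grouping are assumptions made for this illustration; they are not taken from any particular language.

```python
from enum import Enum, auto


class State(Enum):
    BETWEEN = auto()    # skipping whitespace between words
    IN_WORD = auto()    # inside a bare word
    IN_QUOTES = auto()  # inside a "double-quoted" word


def split_words(line):
    """Split one source line into words using a three-state machine."""
    words, current, state = [], [], State.BETWEEN
    for ch in line:
        if state is State.BETWEEN:
            if ch == '"':
                state = State.IN_QUOTES      # quoted word starts, quote is dropped
            elif not ch.isspace():
                current.append(ch)
                state = State.IN_WORD        # bare word starts
        elif state is State.IN_WORD:
            if ch.isspace():
                words.append("".join(current))  # whitespace ends a bare word
                current = []
                state = State.BETWEEN
            else:
                current.append(ch)
        else:  # State.IN_QUOTES
            if ch == '"':
                words.append("".join(current))  # closing quote ends the word
                current = []
                state = State.BETWEEN
            else:
                current.append(ch)           # whitespace is kept inside quotes
    if state is State.IN_QUOTES:
        raise SyntaxError("unterminated quoted word")
    if state is State.IN_WORD:
        words.append("".join(current))       # flush the last bare word
    return words


print(split_words('puts "hello world" 42'))  # prints ['puts', 'hello world', '42']
```

Adding a new piece of syntax later (say, another grouping character) mostly means adding one more state and its transitions, which is what makes this style approachable for a first language.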

I'd say an almost ideal interpreted language for the task is Tcl, for these reasons:

  • Super-simple syntax (almost an absence of it): the syntax mostly defines grouping of characters into words.

    The words parsed from each source line of the script being processed are then interpreted this way: the first word is the name of a command and the rest of them are positional arguments to that command. Get a quick overview of it on Wikipedia.

  • Everything is a string: interpretation of the arguments is in the hands of the commands processing them.

  • It's possible to gradually make the interpreter more "beefy". For instance, for a start one might omit implementing variable substitutions (Tcl < 2.0 did not have them) and all the advanced features (both syntactic and semantic, like namespaces). Likewise, it's possible to start with no commands or almost no commands available to the scripts of such a toy interpreter. Then they might be gradually added.
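The three points above can be sketched as a toy interpreter in Python. This is only an illustration of the idea, not how Tcl itself is implemented: the names `Interp`, `register`, `eval_line`, `eval_script`, `cmd_puts`, and `cmd_add` are all hypothetical, and word splitting is simplified to plain whitespace with no quoting or substitution.

```python
class Interp:
    """Toy command interpreter: the first word names a command, the rest are arguments."""

    def __init__(self):
        self.commands = {}  # command name -> Python callable
        self.output = []    # lines produced by the hypothetical "puts" command

    def register(self, name, func):
        # Commands can be added one at a time as the language grows.
        self.commands[name] = func

    def eval_line(self, line):
        words = line.split()  # simplification: whitespace-only word splitting
        if not words:
            return ""         # blank line: nothing to run
        name, args = words[0], words[1:]
        if name not in self.commands:
            raise NameError(f'invalid command name "{name}"')
        # Every argument is handed over as a string; the command interprets it.
        return self.commands[name](self, *args)

    def eval_script(self, script):
        result = ""
        for line in script.splitlines():
            result = self.eval_line(line)
        return result  # result of the last line evaluated


def cmd_puts(interp, *args):
    interp.output.append(" ".join(args))
    return ""


def cmd_add(interp, *args):
    # The command, not the parser, decides that its arguments are integers.
    return str(sum(int(a) for a in args))


interp = Interp()
interp.register("puts", cmd_puts)
interp.register("add", cmd_add)
result = interp.eval_script("puts hello world\nadd 2 3 4")
print(interp.output)  # prints ['hello world']
print(result)         # prints 9
```

Starting with an empty command table and registering commands as you go mirrors the gradual growth described above; variable substitution could later be added as an extra pass over `words` before dispatch.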

You can get a quick sense of what Tcl is by working through its tutorial. More documentation is there, and you can get Tcl to play with there.

You can ask for help on the #tcl IRC channel on irc.freenode.net and on the comp.lang.tcl newsgroup (available via Google Groups).