What is a Y-combinator?

I've lifted this from http://www.mail-archive.com/[email protected]/msg02716.html which is an explanation I wrote several years ago.

I'll use JavaScript in this example, but many other languages will work as well.

Our goal is to be able to write a recursive function of 1 variable using only functions of 1 variable, with no assignments, no defining things by name, etc. (Why this is our goal is another question; let's just take this as the challenge that we're given.) Seems impossible, huh? As an example, let's implement factorial.

Well, step 1 is to say that we could do this easily if we cheated a little. Using functions of 2 variables and assignment, we can at least avoid having to use assignment to set up the recursion.

// Here's the function that we want to recurse.
X = function (recurse, n) {
  if (0 == n)
    return 1;
  else
    return n * recurse(recurse, n - 1);
};

// This will get X to recurse.
Y = function (builder, n) {
  return builder(builder, n);
};

// Here it is in action.
Y(
  X,
  5
);

Now let's see if we can cheat less. Well firstly we're using assignment, but we don't need to. We can just write X and Y inline.

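// No assignment this time. The outer function is wrapped in parentheses
// so that JavaScript parses it as an expression, not a declaration.
(function (builder, n) {
  return builder(builder, n);
})(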
  function (recurse, n) {
    if (0 == n)
      return 1;
    else
      return n * recurse(recurse, n - 1);
  },
  5
);

But we're using functions of 2 variables to get a function of 1 variable. Can we fix that? Well, a smart guy by the name of Haskell Curry has a neat trick: if you have good higher-order functions, then you only need functions of 1 variable. The proof is that you can get from functions of 2 (or more in the general case) variables to 1 variable with a purely mechanical text transformation like this:

// Original
F = function (i, j) {
  ...
};
F(i,j);

// Transformed
F = function (i) { return function (j) {
  ...
}};
F(i)(j);

where ... remains exactly the same. (This trick is called "currying" after Haskell Curry, though Moses Schönfinkel described it earlier. The language Haskell is also named for Haskell Curry. File that under useless trivia.) Now just apply this transformation everywhere and we get our final version.
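Before the final version, here is the transformation on a concrete function, as a minimal sketch. The names add2 and add1 are made up for illustration, and the assignments are only there so the results can be inspected.

```javascript
// Original: a function of 2 variables (hypothetical example).
const add2 = function (i, j) {
  return i + j;
};

// Transformed: the body is unchanged, only the signature is split in two.
const add1 = function (i) { return function (j) {
  return i + j;
}};

const uncurriedResult = add2(2, 3); // 5
const curriedResult = add1(2)(3);   // 5
```

Applying the same rewrite to the two-argument factorial code gives the final version: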

// The dreaded Y-combinator in action!
// (The leading function is parenthesized so it parses as an expression.)
(function (builder) { return function (n) {
  return builder(builder)(n);
}})(
  function (recurse) { return function (n) {
    if (0 == n)
      return 1;
    else
      return n * recurse(recurse)(n - 1);
  }})(
  5
);

Feel free to try it. alert() that return, tie it to a button, whatever. That code calculates factorials, recursively, without using assignment, declarations, or functions of 2 variables. (But trying to trace how it works is likely to make your head spin. And handing it, without the derivation, just slightly reformatted will result in code that is sure to baffle and confuse.)

You can replace the 4 lines that recursively define factorial with any other recursive function that you want.
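For instance, here is a sketch with Fibonacci swapped in for factorial. The const fib10 is an addition of mine, used only to capture the result; the expression on its right-hand side still uses no assignment, no names, and only functions of 1 variable.

```javascript
const fib10 = (function (builder) { return function (n) {
  return builder(builder)(n);
}})(
  function (recurse) { return function (n) {
    // These lines replace the factorial definition.
    if (n < 2)
      return n;
    else
      return recurse(recurse)(n - 1) + recurse(recurse)(n - 2);
  }})(
  10
);
// fib10 is 55
```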


A Y-combinator is a "functional" (a function that operates on other functions) that enables recursion, when you can't refer to the function from within itself. In computer-science theory, it generalizes recursion, abstracting its implementation, and thereby separating it from the actual work of the function in question. The benefit of not needing a compile-time name for the recursive function is sort of a bonus. =)

This is applicable in languages that support lambda functions. The expression-based nature of lambdas usually means that they cannot refer to themselves by name. The usual workaround is to declare a variable, refer to it inside the lambda, and then assign the lambda to it to complete the self-reference loop, but that is brittle: the lambda can be copied and the original variable re-assigned, which breaks the self-reference.
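A small JavaScript sketch of that failure mode (the variable names here are illustrative, not from any particular codebase):

```javascript
// Declare a variable and assign a lambda that refers to it by name.
let fact = (n) => (n === 0 ? 1 : n * fact(n - 1));
const before = fact(5); // 120

// Copy the lambda, then re-assign the original variable.
const copy = fact;
fact = (n) => 0;

// The copy looks up `fact` at call time, so it now calls the replacement:
// copy(5) computes 5 * fact(4), which is 5 * 0.
const after = copy(5); // 0
```

The copied lambda silently changes behavior because its recursion goes through the variable rather than through the function itself.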

Y-combinators are cumbersome to implement, and often to use, in statically typed languages (which procedural languages often are), because typing restrictions usually require the number of arguments of the function in question to be known at compile time. This means that a separate Y-combinator must be written for each argument count that one needs to use.

Below is an example of the usage and workings of a Y-combinator in C#.

Using a Y-combinator involves an "unusual" way of constructing a recursive function. First you must write your function as a piece of code that calls a pre-existing function, rather than itself:

// Factorial, if func does the same thing as this bit of code...
x == 0 ? 1 : x * func(x - 1);

Then you turn that into a function that takes a function to call, and returns a function that does so. This is called a functional, because it takes one function, and performs an operation with it that results in another function.

// A function that creates a factorial, but only if you pass in
// a function that does what the inner function is doing.
Func<Func<Double, Double>, Func<Double, Double>> fact =
  (recurs) =>
    (x) =>
      x == 0 ? 1 : x * recurs(x - 1);

Now you have a function that takes a function, and returns another function that sort of looks like a factorial, but instead of calling itself, it calls the argument passed into the outer function. How do you make this the factorial? Pass the inner function to itself. The Y-Combinator does that, by being a function with a permanent name, which can introduce the recursion.

// One-argument Y-Combinator.
public static Func<T, TResult> Y<T, TResult>(Func<Func<T, TResult>, Func<T, TResult>> F)
{
  return
    t =>  // A function that...
      F(  // Calls the factorial creator, passing in...
        Y(F)  // The result of this same Y-combinator function call...
              // (Here is where the recursion is introduced.)
        )
      (t); // And passes the argument into the work function.
}

Rather than the factorial calling itself, what happens is that the factorial calls the factorial generator (returned by the recursive call to Y-Combinator). And depending on the current value of t the function returned from the generator will either call the generator again, with t - 1, or just return 1, terminating the recursion.

It's complicated and cryptic, but it all shakes out at run-time, and the key to its working is "deferred execution", and the breaking up of the recursion to span two functions. The inner F is passed as an argument, to be called in the next iteration, only if necessary.
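For comparison, the same one-argument construction can be sketched in JavaScript, where the lack of static types removes the generic plumbing. This is my own translation of the C# above, not part of the original code, and the names factGen and factorial are assumptions made for the example.

```javascript
// One-argument Y, mirroring the C# version: it recurses through its own name.
function Y(F) {
  return function (t) {
    // Y(F) is evaluated only when the returned function is called.
    // This deferral is what keeps the expansion from running forever.
    return F(Y(F))(t);
  };
}

// The factorial "generator": it calls `recurs` instead of calling itself.
const factGen = function (recurs) { return function (x) {
  return x === 0 ? 1 : x * recurs(x - 1);
}};

const factorial = Y(factGen);
const result = factorial(5); // 120
```

Each call to factorial builds one more layer of F(Y(F)), and only as many layers as the argument requires.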


If you're ready for a long read, Mike Vanier has a great explanation. Long story short, it allows you to implement recursion in a language that doesn't necessarily support it natively.