The goal of this post is to jot down a few notes about the Y combinator, explaining how it works without getting too much into lambda-calculus theory. I'll be using Clojure and Python as the demonstration languages.
The idea is to build up intuition for the Y combinator from simple examples in a way that makes understanding it a sequences of small mental leaps rather than one large one.
Recursion with named functions
It wouldn't be a proper article about recursion if it didn't start with a factorial. Here's a fairly run-of-the-mill implementation in Clojure:
(defn factorial-rec [n] (if (zero? n) 1 (* n (factorial-rec (- n 1)))))
Recursion is accomplished by invoking the function, by name, within itself. Herein begins the thought experiment that will lead us to the Y combinator. Imagine that we're using a language where functions have no names - they're all anonymous. We can assign anonymous functions to symbols, but those symbols aren't visible or usable from within the function's body.
As an example of what I'm talking about, here is a non-recursive implementation of factorial in Clojure:
(def factorial-loop (fn [n] (loop [i n answer 1] (if (zero? i) answer (recur (- i 1) (* answer i))))))
Note how this is defined: we assign an anonymous function (lambda in Lisp/Scheme/Python parlance, fn in Clojure) to the symbol factorial-loop. This anonymous function computes the factorial of its parameter, and we can call it as follows:
ycombinator.core=> (factorial-loop 6) 720
To emphasize that factorial-loop is just a convenience symbol and plays no role in the implementation, we can forego it for a slightly more convoluted invocation:
ycombinator.core=> ((fn [n] #_=> (loop [i n answer 1] #_=> (if (zero? i) #_=> answer #_=> (recur (- i 1) (* answer i))))) 6) 720
No names in sight - we just invoke the anonymous function directly. But this implementation of factorial isn't recursive, so we don't really need to refer to the function's name from within its body. What if we do want to use recursion? This brings us back to the thought experiment.
Recursion with anonymous functions
It turns out this is absolutely possible by using some ingenuity and cranking the abstraction level up one notch. In our original factorial-rec, at the point where the function invokes itself all we need is an object that implements factorial, right? In factorial-rec we're using the fact that the symbol factorial-rec is bound to such an object (by the nature of defn). But we can't rely on that in our thought experiment. How else can we get access to such an object? Well, we can take it as a parameter... Here's how:
(def factorial-maker (fn [self] (fn [n] (if (zero? n) 1 (* n ((self self) (- n 1)))))))
And now we can compute factorials as follows:
ycombinator.core=> ((factorial-maker factorial-maker) 6) 720
A few things to note:
- factorial-maker is not computing a factorial. It creates an (anonymous) function that computes a factorial. It expects to be passed itself as a parameter.
- The expression (factorial-maker factorial-maker) does precisely that. It invokes factorial-maker and passes it itself as a parameter. The result of that is a function that computes a factorial, which we then apply to 6.
- The recursion inside the factorial is replaced by (self self); when the function created by (factorial-maker factorial-maker) runs for the first time, self is assigned to factorial-maker, so (self self) is (factorial-maker factorial maker). This is equivalent to the first call - recursion!
You may still feel uncomfortable about the def and the name factorial-maker. Aren't we just cheating? Nope, because we can do the same expansion as we did with factorial-loop; we don't need that def. Here's how it would look:
ycombinator.core=> (((fn [self] #_=> (fn [n] #_=> (if (zero? n) #_=> 1 #_=> (* n ((self self) (- n 1)))))) #_=> (fn [self] #_=> (fn [n] #_=> (if (zero? n) #_=> 1 #_=> (* n ((self self) (- n 1))))))) 6) 720
Pretty it is not... But hey, we've now implemented a recursive factorial function, without a single name in sight. How cool is that?
Understanding the example above is about 80% of the way to understanding the Y combinator, so make sure to spend the time required to thoroughly grok how it works. Tracing through the execution for 2-3 calls while drawing the "environments" (call frames) in action helps a lot.
To get a better feel of the direction we're taking, here's another recursive function that's slightly more complex than the factorial:
(defn tree-sum-rec [t] (if (nil? t) 0 (let [[nodeval left right] t] (+ nodeval (tree-sum-rec left) (tree-sum-rec right)))))
Given a binary tree represented as a list-of-lists with numbers for node deta, this function computes the sum of all the nodes in the tree. For example:
ycombinator.core=> (def t1 '(1 (2) (4 (3) (7)))) #'ycombinator.core/t1 ycombinator.core=> (tree-sum-rec t1) 17
We can rewrite it without using any symbol names within the function as follows:
(def tree-sum-maker (fn [self] (fn [t] (if (nil? t) 0 (let [[nodeval left right] t] (+ nodeval ((self self) left) ((self self) right)))))))
And invoke it as follows:
ycombinator.core=> ((tree-sum-maker tree-sum-maker) t1) 17
Note the similarities between tree-sum-maker and factorial-maker. They are transformed very similarly to synthesize the unnamed from the named-recursion variant. The recipe seems to be:
- Instead of a function taking a parameter, create a function factory that accepts itself as the self parameter, and returns the actual computation function.
- In every place where we'd previously call ourselves, call (self self) instead.
- The initial invocation of (foo param) is replaced by ((foo-maker foo-maker) param).
Y combinator - a tool for making anonymous functions recursive
Since there is a clear pattern here, we should be able to abstract it away and provide some method that transforms a given named-recursive function into an unnamed variant. This is precisely what the Y combinator does, though the nature of the problem makes it somewhat obscure at first sight:
(def Ycombinator (fn [func] ((fn [self] (func (fn [n] ((self self) n)))) (fn [self] (func (fn [n] ((self self) n)))))))
I'll explain how it works shortly, but first let's see how we use it. We have to write our factorial as follows:
(def factorial-rec* (fn [recurse] (fn [n] (if (zero? n) 1 (* n (recurse (- n 1)))))))
Note the superficial similarity to the factorial-maker version. factorial-rec* also takes a function and returns the actual function computing the factorial, though in this case I don't call the function parameter self (it's not self in the strict sense, as we'll soon see). We can convert this function to a recursive computation of the factorial by invoking the Y combinator on it:
ycombinator.core=> ((Ycombinator factorial-rec*) 6) 720
It's easiest to understand how Ycombinator does its magic by unraveling this invocation step by step. Similarly to how we did earlier, we can get rid of the Ycombinator name and just apply the object it's defined to be directly:
(((fn [func] ((fn [self] (func (fn [n] ((self self) n)))) (fn [self] (func (fn [n] ((self self) n)))))) factorial-rec*) 6)
As before, this does two things:
- Call the Y combinator (just a scary-looking anonymous function) on factorial-rec*.
- Call the result of (1) on 6.
If you look carefully at step 1, it invokes the following anonymous function:
(fn [self] (func (fn [n] ((self self) n))))
On itself, with func bound to factorial-rec*. So what we get is:
(((fn [self] (factorial-rec* (fn [n] ((self self) n)))) (fn [self] (factorial-rec* (fn [n] ((self self) n))))) 6)
And if we actually perform the call:
((factorial-rec* (fn [n] (((fn [self] (factorial-rec* (fn [n] ((self self) n)))) (fn [self] (factorial-rec* (fn [n] ((self self) n))))) n))) 6)
This calls factorial-rec*, passing it an anonymous function as recurse . factorial-rec* returns a factorial-computing function. This is where the first step ends. Invoking this factorial-computing function on 6 is the second step.
It should now be obvious what's going on. When the invocation with 6 happens and the program gets to calling recurse, it calls the parameter of factorial-rec* as shown above. But we've already unwrapped this call before - it... recurses into factorial-rec*, while propagating itself forward so that the recurse parameter is always bound properly. It's just the same trick as was employed by factorial-maker earlier in the post.
So, the Y combinator is the magic sauce that lets us take code like factorial-rec and convert it into code like factorial-maker. Here's how we can implement an unnamed version of tree-sum-rec:
(def tree-sum-rec* (fn [recurse] (fn [t] (if (nil? t) 0 (let [[nodeval left right] t] (+ nodeval (recurse left) (recurse right)))))))
And using it with the Y combinator:
ycombinator.core=> ((Ycombinator tree-sum-rec*) t1) 17
Here is an alternative formulation of the Y combinator that can make it a bit easier to understand. In this version I'm using named Clojure functions for further simplification (since many folks find the syntax of anonymous functions applied to other anonymous functions too cryptic):
(defn apply-to-self [func] (func func)) (defn Ycombinator-alt [func] (apply-to-self (fn [self] (func (fn [n] ((self self) n))))))
The Y combinator in Python
Finally, just to show that the Y combinator isn't something unique to the Lisp family of languages, here's a Python implementation:
ycombinator = lambda func: \ (lambda self: func(lambda n: (self(self))(n)))( lambda self: func(lambda n: (self(self))(n))) factorial = lambda recurse: \ lambda n: \ 1 if n == 0 else n * recurse(n - 1)
And we can invoke it as follows:
>>> (ycombinator(factorial))(6) 720
There's no real difference between the Python and the Clojure versions. As long as the language supports creating anonymous functions and treats them as first-class citizens, all is good.
It's even possible to create the Y combinator in C++. Static typing makes it somewhat less elegant than in the more dynamic languages, but C++14's generic lambdas help a lot. Take a look at Rosetta Code for an example.
Incidentally, note that by starting with (Ycombinator factorial-rec*), we now got to (factorial-rec* (Ycombinator factorial-rec*)). For this reason, the Y combinator is a fixed-point combinator in lambda calculus.
There's another interesting thing to note here - the equivalence mentioned above is imperfect. The call (Ycombinator factorial-rec*) results in a delayed fixed point equivalence (the delay achieved by means of wrapping the result in a fn). This is because we're using Clojure - an eagerly evaluated language. This version of the Y combinator is called the applicative-order Y combinator. Without the delay, we'd get an infinite loop. In lazily evaluated languages, it's possible to define the Y combinator somewhat more succinctly.
All of this is very interesting, but I'm deliberately avoiding getting too deep into lambda calculus and programming language theory in this post; I may write more about it some time in the future.