The goal of this post is to jot down a few notes about the Y combinator, explaining how it works without getting too much into lambda-calculus theory. I'll be using Clojure and Python as the demonstration languages.

The idea is to build up intuition for the Y combinator from simple examples in a way that makes understanding it a sequences of small mental leaps rather than one large one.

## Recursion with named functions

It wouldn't be a proper article about recursion if it didn't start with a factorial. Here's a fairly run-of-the-mill implementation in Clojure:

```
(defn factorial-rec
[n]
(if (zero? n)
1
(* n (factorial-rec (- n 1)))))
```

Recursion is accomplished by invoking the function, by name, within itself. Herein begins the thought experiment that will lead us to the Y combinator. Imagine that we're using a language where functions have no names - they're all anonymous. We can assign anonymous functions to symbols, but those symbols aren't visible or usable from within the function's body.

As an example of what I'm talking about, here is a non-recursive implementation of factorial in Clojure:

```
(def factorial-loop
(fn [n]
(loop [i n answer 1]
(if (zero? i)
answer
(recur (- i 1) (* answer i))))))
```

Note how this is defined: we assign an anonymous function (`lambda` in
Lisp/Scheme/Python parlance, `fn` in Clojure) to the symbol
`factorial-loop`. This anonymous function computes the factorial of its
parameter, and we can call it as follows:

```
ycombinator.core=> (factorial-loop 6)
720
```

To emphasize that `factorial-loop` is just a convenience symbol and plays no
role in the implementation, we can forego it for a slightly more convoluted
invocation:

```
ycombinator.core=> ((fn [n]
#_=> (loop [i n answer 1]
#_=> (if (zero? i)
#_=> answer
#_=> (recur (- i 1) (* answer i))))) 6)
720
```

No names in sight - we just invoke the anonymous function directly. But this
implementation of factorial isn't recursive, so we don't really *need* to refer
to the function's name from within its body. What if we do want to use
recursion? This brings us back to the thought experiment.

## Recursion with anonymous functions

It turns out this is absolutely possible by using some ingenuity and cranking
the abstraction level up one notch. In our original `factorial-rec`, at the
point where the function invokes itself all we need is an object that implements
factorial, right? In `factorial-rec` we're using the fact that the symbol
`factorial-rec` is bound to such an object (by the nature of `defn`). But
we can't rely on that in our thought experiment. How else can we get access to
such an object? Well, we can take it as a parameter... Here's how:

```
(def factorial-maker
(fn [self]
(fn [n]
(if (zero? n)
1
(* n ((self self) (- n 1)))))))
```

And now we can compute factorials as follows:

```
ycombinator.core=> ((factorial-maker factorial-maker) 6)
720
```

A few things to note:

`factorial-maker`is not computing a factorial. It creates an (anonymous) function that computes a factorial. It expects to be passed*itself*as a parameter.- The expression
`(factorial-maker factorial-maker)`does precisely that. It invokes`factorial-maker`and passes it itself as a parameter. The result of that is a function that computes a factorial, which we then apply to 6. - The recursion inside the factorial is replaced by
`(self self)`; when the function created by`(factorial-maker factorial-maker)`runs for the first time,`self`is assigned to`factorial-maker`, so`(self self)`is`(factorial-maker factorial maker)`. This is equivalent to the first call - recursion!

You may still feel uncomfortable about the `def` and the name
`factorial-maker`. Aren't we just cheating? Nope, because we can do the same
expansion as we did with `factorial-loop`; we don't need that `def`. Here's
how it would look:

```
ycombinator.core=> (((fn [self]
#_=> (fn [n]
#_=> (if (zero? n)
#_=> 1
#_=> (* n ((self self) (- n 1))))))
#_=> (fn [self]
#_=> (fn [n]
#_=> (if (zero? n)
#_=> 1
#_=> (* n ((self self) (- n 1))))))) 6)
720
```

Pretty it is not... But hey, we've now implemented a recursive factorial function, without a single name in sight. How cool is that?

Understanding the example above is about 80% of the way to understanding the Y combinator, so make sure to spend the time required to thoroughly grok how it works. Tracing through the execution for 2-3 calls while drawing the "environments" (call frames) in action helps a lot.

To get a better feel of the direction we're taking, here's another recursive function that's slightly more complex than the factorial:

```
(defn tree-sum-rec
[t]
(if (nil? t)
0
(let [[nodeval left right] t]
(+ nodeval
(tree-sum-rec left)
(tree-sum-rec right)))))
```

Given a binary tree represented as a list-of-lists with numbers for node deta, this function computes the sum of all the nodes in the tree. For example:

```
ycombinator.core=> (def t1 '(1 (2) (4 (3) (7))))
#'ycombinator.core/t1
ycombinator.core=> (tree-sum-rec t1)
17
```

We can rewrite it without using any symbol names within the function as follows:

```
(def tree-sum-maker
(fn [self]
(fn [t]
(if (nil? t)
0
(let [[nodeval left right] t]
(+ nodeval
((self self) left)
((self self) right)))))))
```

And invoke it as follows:

```
ycombinator.core=> ((tree-sum-maker tree-sum-maker) t1)
17
```

Note the similarities between `tree-sum-maker` and `factorial-maker`. They
are transformed very similarly to synthesize the unnamed from the
named-recursion variant. The recipe seems to be:

- Instead of a function taking a parameter, create a function factory that
accepts itself as the
`self`parameter, and returns the actual computation function. - In every place where we'd previously call ourselves, call
`(self self)`instead. - The initial invocation of
`(foo param)`is replaced by`((foo-maker foo-maker) param)`.

## Y combinator - a tool for making anonymous functions recursive

Since there is a clear pattern here, we should be able to abstract it away and provide some method that transforms a given named-recursive function into an unnamed variant. This is precisely what the Y combinator does, though the nature of the problem makes it somewhat obscure at first sight:

```
(def Ycombinator
(fn [func]
((fn [self]
(func (fn [n] ((self self) n))))
(fn [self]
(func (fn [n] ((self self) n)))))))
```

I'll explain how it works shortly, but first let's see how we use it. We have
to write our `factorial` as follows:

```
(def factorial-rec*
(fn [recurse]
(fn [n]
(if (zero? n)
1
(* n (recurse (- n 1)))))))
```

Note the superficial similarity to the `factorial-maker` version.
`factorial-rec*` also takes a function and returns the actual function
computing the factorial, though in this case I don't call the function parameter
`self` (it's not `self` in the strict sense, as we'll soon see). We can
convert this function to a recursive computation of the factorial by invoking
the Y combinator on it:

```
ycombinator.core=> ((Ycombinator factorial-rec*) 6)
720
```

It's easiest to understand how `Ycombinator` does its magic by unraveling this
invocation step by step. Similarly to how we did earlier, we can get rid of the
`Ycombinator` name and just apply the object it's defined to be directly:

```
(((fn [func]
((fn [self]
(func (fn [n] ((self self) n))))
(fn [self]
(func (fn [n] ((self self) n))))))
factorial-rec*)
6)
```

As before, this does two things:

- Call the Y combinator (just a scary-looking anonymous function)
on
`factorial-rec*`. - Call the result of (1) on 6.

If you look carefully at step 1, it invokes the following anonymous function:

```
(fn [self]
(func (fn [n] ((self self) n))))
```

On itself, with `func` bound to `factorial-rec*`. So what we get is:

```
(((fn [self]
(factorial-rec* (fn [n] ((self self) n))))
(fn [self]
(factorial-rec* (fn [n] ((self self) n)))))
6)
```

And if we actually perform the call:

```
((factorial-rec* (fn [n] (((fn [self]
(factorial-rec* (fn [n] ((self self) n))))
(fn [self]
(factorial-rec* (fn [n] ((self self) n)))))
n)))
6)
```

This calls `factorial-rec*`, passing it an anonymous function as `recurse`
[1]. `factorial-rec*` returns a factorial-computing function. This is where
the first step ends. Invoking this factorial-computing function on 6 is the
second step.

It should now be obvious what's going on. When the invocation with 6 happens and
the program gets to calling `recurse`, it calls the parameter of
`factorial-rec*` as shown above. But we've already unwrapped this call before
- it... recurses into `factorial-rec*`, while propagating itself forward
so that the `recurse` parameter is always bound properly. It's just the same
trick as was employed by `factorial-maker` earlier in the post.

So, the Y combinator is the magic sauce that lets us take code like
`factorial-rec` and convert it into code like `factorial-maker`. Here's
how we can implement an unnamed version of `tree-sum-rec`:

```
(def tree-sum-rec*
(fn [recurse]
(fn [t]
(if (nil? t)
0
(let [[nodeval left right] t]
(+ nodeval
(recurse left)
(recurse right)))))))
```

And using it with the Y combinator:

```
ycombinator.core=> ((Ycombinator tree-sum-rec*) t1)
17
```

Here is an alternative formulation of the Y combinator that can make it a bit easier to understand. In this version I'm using named Clojure functions for further simplification (since many folks find the syntax of anonymous functions applied to other anonymous functions too cryptic):

```
(defn apply-to-self [func] (func func))
(defn Ycombinator-alt
[func]
(apply-to-self
(fn [self]
(func (fn [n] ((self self) n))))))
```

## The Y combinator in Python

Finally, just to show that the Y combinator isn't something unique to the Lisp family of languages, here's a Python implementation:

```
ycombinator = lambda func: \
(lambda self: func(lambda n: (self(self))(n)))(
lambda self: func(lambda n: (self(self))(n)))
factorial = lambda recurse: \
lambda n: \
1 if n == 0 else n * recurse(n - 1)
```

And we can invoke it as follows:

```
>>> (ycombinator(factorial))(6)
720
```

There's no real difference between the Python and the Clojure versions. As long as the language supports creating anonymous functions and treats them as first-class citizens, all is good.

It's even possible to create the Y combinator in C++. Static typing makes it somewhat less elegant than in the more dynamic languages, but C++14's generic lambdas help a lot. Take a look at Rosetta Stone for an example.

[1] | Incidentally, note that by starting with There's another interesting thing to note here - the equivalence
mentioned above is imperfect. The call All of this is very interesting, but I'm deliberately avoiding getting too deep into lambda calculus and programming language theory in this post; I may write more about it some time in the future. |