A *higher-order function* is a function that takes other functions as arguments,
or returns a function as its result. Higher-order functions are an exceptionally
powerful software design tool because they make it easy to create new
abstractions and to compose existing ones. In this post I will present a case study - a set of
functions that defines an interesting problem domain. By reading and
understanding this code, hopefully anyone can appreciate the power and beauty of
higher-order functions and how they enable constructing powerful abstractions
from basic building blocks.

One of my all-time favorite programming
books is Peter Norvig's PAIP (Paradigms of Artificial Intelligence Programming). In
section *6.4 - A Set of Searching Tools*, Norvig presents some code for defining
different variants of tree searching that I've always found very elegant.

Here's a quick reimplementation of the main idea in Clojure (see this repository for the full, runnable code); I'm using Clojure since it's a modern Lisp that I enjoy learning and using from time to time.

First, some prerequisites. As is often the case in dynamically-typed Lisp, entities can be described in a very abstract way. The code presented here searches trees, but there is no tree data structure per se; the tree is defined implicitly by functions. Specifically, there's a notion of a "state" (a tree node) and a way to get from a given state to its children states (successors); a function maps between the two.

In our case, let's use integers as states; then, an infinite binary tree can be defined using the following successor function:

```
(defn binary-tree
  "A successors function representing a binary tree."
  [x]
  (list (* 2 x) (+ 1 (* 2 x))))
```

Given a state (a number), it returns its children as a list. Simply put, in this tree, node N has the children 2N and 2N+1.

Here are the first few layers of such a tree:
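
They can be generated with a throwaway REPL expression (not part of the repository's code) that expands each layer into the next with `mapcat`:

```
paip.core=> (take 4 (iterate #(mapcat binary-tree %) [1]))
([1] (2 3) (4 5 6 7) (8 9 10 11 12 13 14 15))
```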

In one sense, the tree is infinite because `binary-tree` will happily return
the successors of any node we ask it about:

```
paip.core=> (binary-tree 9999)
(19998 19999)
```

But in another sense, *there is no tree*. This is a beautiful implication
of using functions instead of concrete data - they easily enable *lazy
evaluation*. We cannot materialize an infinite tree inside a
necessarily finite computer, but we can operate on it all the same because of
this abstraction. As far as the search algorithm is concerned, there exists an
abstract *state space* and we tell it how to navigate and interpret it.

Now we're ready to look at the generic search function:

```
(defn tree-search
  "Finds a state that satisfies goal?-fn; starts with states, and searches
  according to successors and combiner. If successful, returns the state;
  otherwise returns nil."
  [states goal?-fn successors combiner]
  (cond (empty? states) nil
        (goal?-fn (first states)) (first states)
        :else (tree-search (combiner (successors (first states))
                                     (rest states))
                           goal?-fn
                           successors
                           combiner)))
```

Let's dig in. The function accepts the following parameters:

- `states`: a list of starting states for the search. When invoked by the user, this list will typically have a single element; when `tree-search` calls itself, this list is the states that it plans to explore next.
- `goal?-fn`: a goal detection function. The search doesn't know anything about states or what the goal of the search is, so this is parameterized by a function. `goal?-fn` is expected to return `true` for a goal state (the state we were searching for) and `false` for all other states.
- `successors`: the search function also doesn't know anything about what kind of tree it's searching through; what are the children of a given state? Is it searching a binary tree? An N-ary tree? Something more exotic? All of this is parameterized via the `successors` function provided by the user.
- `combiner`: finally, the search strategy can be parameterized as well. There are many different kinds of searches possible - BFS, DFS and others. `combiner` takes a list of successors for the current state the search is looking at, as well as a list of all the other states the search still plans to look at. It combines these into a single list *somehow*, and thus guides the order in which the search happens.

Even before we see how this function is used, it's already apparent that this
is quite a powerful abstraction. `tree-search` defines the essence of what
it means to "search a tree", while being oblivious to what the tree contains,
how it's structured and even what order it should be searched in; all of this
is supplied by functions passed in as parameters.

Let's see an example, doing a BFS search on our infinite binary tree. First,
we define a `breadth-first-search` function:

```
(defn breadth-first-search
  "Search old states first until goal is reached."
  [start goal?-fn successors]
  (tree-search (list start) goal?-fn successors prepend))
```

This function takes a start state (a single state, not a list), `goal?-fn` and
`successors`, but it sets the `combiner` parameter to the `prepend`
function, which is defined as follows:

```
(defn prepend
  "Combiner: appends new successors x after the existing states y."
  [x y]
  (concat y x))
```

It defines the search strategy (BFS = first look at the rest of the states and only then at successors of the current state), but still leaves the tree structure and the notion of what a goal is to parameters. Let's see it in action:

```
paip.core=> (breadth-first-search 1 #(= % 9) binary-tree)
9
```

Here we pass the anonymous function literal `#(= % 9)` as the `goal?-fn`
parameter. This function simply checks whether the state passed to it is the
number 9. We also pass `binary-tree` as the `successors`, since we're going
to be searching in our infinite binary tree. BFS works layer by layer, so it
has no issue with that and finds the state quickly.

We can turn on verbosity (refer to the full code to
see how it works) to see what `states` parameter `tree-search` gets called
with, observing the progression of the search:

```
paip.core=> (with-verbose (breadth-first-search 1 #(= % 9) binary-tree))
;; Search: (1)
;; Search: (2 3)
;; Search: (3 4 5)
;; Search: (4 5 6 7)
;; Search: (5 6 7 8 9)
;; Search: (6 7 8 9 10 11)
;; Search: (7 8 9 10 11 12 13)
;; Search: (8 9 10 11 12 13 14 15)
;; Search: (9 10 11 12 13 14 15 16 17)
9
```

This is the `prepend` combiner in action; for example, after `(3 4 5)`, the
combiner prepends `(4 5)` to the successors of 3 (the list `(6 7)`), getting
`(4 5 6 7)` as the list of states to search through. Overall, observing the
first element in the `states` list through the printed lines, it's clear this
is classical BFS where the tree is visited in "layers".
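
This single combining step can be checked directly at the REPL - the successors of 3 go after the pending states:

```
paip.core=> (prepend '(6 7) '(4 5))
(4 5 6 7)
```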

Implementing DFS using `tree-search` is similarly easy:

```
(defn depth-first-search
  "Search new states first until goal is reached."
  [start goal?-fn successors]
  (tree-search (list start) goal?-fn successors concat))
```

The only difference from BFS is the `combiner` parameter - here we use
`concat` since we want to examine the successors of the first state *before*
we examine the other states on the list. If we run `depth-first-search` on our
infinite binary tree we'll get a stack overflow (unless we're looking for a
state that's on the left-most path), so let's create a safer tree first. This
function can serve as a `successors` to define a "finite" binary tree, with
the given maximal state value:

```
(defn finite-binary-tree
  "Returns a successor function that generates a binary tree with n nodes."
  [n]
  (fn [x]
    (filter #(<= % n) (binary-tree x))))
```

Note the clever use of higher-order functions here. `finite-binary-tree` is
not a `successors` function itself - rather, it's a generator of such
functions; given a value `n`, it creates a new function that acts as `successors`
but limits the states' values to `n`.

For example, `(finite-binary-tree 15)` will create a `successors` function
that represents exactly the binary tree on the diagram above; if we ask it about
successors of states on the fourth layer, it will say there are none:

```
paip.core=> (def f15 (finite-binary-tree 15))
#'paip.core/f15
paip.core=> (f15 4)
(8 9)
paip.core=> (f15 8)
()
paip.core=> (f15 7)
(14 15)
paip.core=> (f15 15)
()
```

As another test, let's try to look for a state that's not in our finite tree.
Our infinite tree theoretically has *all* the states:

```
paip.core=> (breadth-first-search 1 #(= % 33) binary-tree)
33
```

But not the finite tree:

```
paip.core=> (breadth-first-search 1 #(= % 33) (finite-binary-tree 15))
nil
```

With our finite tree, we are ready to use `depth-first-search`:

```
paip.core=> (with-verbose (depth-first-search 1 #(= % 9) (finite-binary-tree 15)))
;; Search: (1)
;; Search: (2 3)
;; Search: (4 5 3)
;; Search: (8 9 5 3)
;; Search: (9 5 3)
9
```

Note the search order: when `(2 3)` is explored, 2's successors `(4 5)` come
*before* 3 in the next call - this is the defining property of DFS.
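
The corresponding combining step for DFS is plain `concat` - the successors `(4 5)` of 2 go in front of the remaining states `(3)`:

```
paip.core=> (concat '(4 5) '(3))
(4 5 3)
```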

We can implement more advanced search strategies using this infrastructure. For
example, suppose we have a heuristic that tells us which states to prioritize in
order to get to the goal faster (akin to A* search on graphs). We can define
a `best-first-search` that sorts the states according to our heuristic and
tries the most promising states first ("best" as in "best looking among the
current candidates", not as in "globally best").

First, let's define a couple of helper higher-order functions:

```
(defn diff
  "Given n, returns a function that computes the distance of its argument from n."
  [n]
  (fn [x] (Math/abs (- x n))))

(defn sorter
  "Returns a combiner function that sorts according to cost-fn."
  [cost-fn]
  (fn [new old]
    (sort-by cost-fn (concat new old))))
```

`diff` is a function generator like `finite-binary-tree`; it takes a target
number `n` and returns a function that computes its parameter `x`'s distance
from `n`.

`sorter` returns a function that serves as the `combiner` for our
search, based on a cost function. This is done by concatenating the two lists
(successors of first state and the rest of the states) first, and then sorting
them by the cost function. `sorter` is a powerful example of modeling with
higher-order functions.
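
To make this concrete: a combiner built from `(diff 9)` merges successors and pending states into a single cost-ordered list. Here the successors `(12 13)` of state 6 are merged with pending states `(14 15 2)` - their distances from 9 are 3, 4, 5, 6 and 7, respectively:

```
paip.core=> ((sorter (diff 9)) '(12 13) '(14 15 2))
(12 13 14 15 2)
```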

With these building blocks in place, we can define `best-first-search`:

```
(defn best-first-search
  "Search lowest cost states first until goal is reached."
  [start goal?-fn successors cost-fn]
  (tree-search (list start) goal?-fn successors (sorter cost-fn)))
```

Once again, this is just like the earlier BFS and DFS - only the strategy
(`combiner`) changes. Let's use it to find 9 again:

```
paip.core=> (with-verbose (best-first-search 1 #(= % 9) (finite-binary-tree 15) (diff 9)))
;; Search: (1)
;; Search: (3 2)
;; Search: (7 6 2)
;; Search: (6 14 15 2)
;; Search: (12 13 14 15 2)
;; Search: (13 14 15 2)
;; Search: (14 15 2)
;; Search: (15 2)
;; Search: (2)
;; Search: (5 4)
;; Search: (10 11 4)
;; Search: (11 4)
;; Search: (4)
;; Search: (9 8)
9
```

While it finds the state eventually, we discover that our heuristic is not a great match for this problem, as it sends the search astray. The goal of this post is to demonstrate the power of higher-order functions in building modular code, not to discover an optimal heuristic for searching in binary trees, though :-)

One last search variant before we're ready to wrap up. As we've seen with the
infinite tree, sometimes the search space is too large and we have to compromise
on which states to look at and which to ignore. This technique works
particularly well if the target is not some single value that we must find, but
rather we want to get a "good enough" result in a sea of bad options. We can
use a technique called *beam search*; think of the beam of light a flashlight
produces in a very dark room: we can see what the beam points at, but not much
else.

Beam search is somewhat similar to our `best-first-search`, but after combining
and sorting the list of states to explore, it only keeps the first N, where
N is given by the `beam-width` parameter:

```
(defn beam-search
  "Search highest scoring states first until goal is reached, but never
  consider more than beam-width states at a time."
  [start goal?-fn successors cost-fn beam-width]
  (tree-search (list start) goal?-fn successors
               (fn [new old]
                 (let [sorted ((sorter cost-fn) new old)]
                   (take beam-width sorted)))))
```

Once again, higher-order functions at play: as its `combiner`, `beam-search`
creates an anonymous function that sorts the list based on `cost-fn`, and then
keeps only the first `beam-width` states on that list.

Exercise: Try to run it - what beam width do you need to set in order to successfully find 9 using our cost heuristic? How can this be improved?

## Conclusion

This post takes a code-walkthrough approach to demonstrating the power of
higher-order functions. I've always found this particular example from PAIP very
elegant; a particularly powerful insight is the distilled difference between
DFS and BFS. While most programmers intuitively understand the difference and
could write down the pseudo-code for both search strategies, modeling the
problem with higher-order functions lets us really get to the essence of the
difference - `concat` vs. `prepend` as the combiner step.