A higher-order function is a function that takes other functions as arguments, or returns a function as its result. Higher-order functions are an exceptionally powerful software design tool because they make it easy to create new abstractions and to compose them. In this post I will present a case study - a set of functions that defines an interesting problem domain. By reading and understanding this code, I hope anyone can appreciate the power and beauty of higher-order functions and how they enable constructing powerful abstractions from basic building blocks.

One of my all-time favorite programming books is Peter Norvig's PAIP (Paradigms of Artificial Intelligence Programming). In section 6.4 - A Set of Searching Tools, it presents some code for defining different variants of tree search that I've always found very elegant.

Here's a quick reimplementation of the main idea in Clojure (see this repository for the full, runnable code); I'm using Clojure since it's a modern Lisp that I enjoy learning and using from time to time.

First, some prerequisites. As is often the case in dynamically-typed Lisps, entities can be described in a very abstract way. The code presented here searches trees, but there is no tree data structure per se; the tree is defined using functions. Specifically, there's a notion of a "state" (a tree node) and a way to get from a given state to its children states (its successors); a function maps between the two.

In our case let's have integers as states; then, an infinite binary tree can be defined using the following successor function:

(defn binary-tree
  "A successors function representing a binary tree."
  [x]
  (list (* 2 x) (+ 1 (* 2 x))))

Given a state (a number), it returns its children as a list. Simply put, in this tree node N has the children 2N and 2N+1.

Here are the first few layers of such a tree:

[Diagram: a binary tree with 15 nodes, numbered 1-15]

In one sense, the tree is infinite because binary-tree will happily return the successors of any node we ask about:

paip.core=> (binary-tree 9999)
(19998 19999)

But in another sense, there is no tree. This is a beautiful implication of using functions instead of concrete data - they easily enable lazy evaluation. We cannot materialize an infinite tree inside a necessarily finite computer, but we can operate on it all the same because of this abstraction. As far as the search algorithm is concerned, there exists an abstract state space and we tell it how to navigate and interpret it.
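As a small aside (these helpers are my addition, not part of the search code; binary-tree is repeated so the snippet is self-contained), Clojure's laziness makes this concrete: we can describe the infinite tree layer by layer, and only the layers we actually consume are ever computed.

```clojure
;; binary-tree, repeated from above so this snippet stands alone:
(defn binary-tree [x]
  (list (* 2 x) (+ 1 (* 2 x))))

;; An infinite lazy sequence of tree layers: each layer is the
;; concatenation of the successors of the previous layer's nodes.
(def layers (iterate #(mapcat binary-tree %) [1]))

;; Taking a finite prefix forces only that much of the tree:
(take 4 layers)
;; => ([1] (2 3) (4 5 6 7) (8 9 10 11 12 13 14 15))
```

The infinite structure is never materialized; `take` decides how much of it comes into existence.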

Now we're ready to look at the generic search function:

(defn tree-search
  "Finds a state that satisfies goal?-fn; Starts with states, and searches
  according to successors and combiner. If successful, returns the state;
  otherwise returns nil."
  [states goal?-fn successors combiner]
  (cond (empty? states) nil
        (goal?-fn (first states)) (first states)
        :else (tree-search (combiner (successors (first states))
                                     (rest states))
                           goal?-fn successors combiner)))

Let's dig in. The function accepts the following parameters:

  • states: a list of starting states for the search. When invoked by the user, this list will typically have a single element; when tree-search calls itself, this list is the states that it plans to explore next.
  • goal?-fn: a goal detection function. The search doesn't know anything about states and what the goal of the search is, so this is parameterized by a function. goal?-fn is expected to return true for a goal state (the state we were searching for) and false for all other states.
  • successors: the search function also doesn't know anything about what kind of tree it's searching through; what are the children of a given state? Is it searching a binary tree? An N-ary tree? Something more exotic? All of this is parameterized via the successors function provided by the user.
  • combiner: finally, the search strategy can be parameterized as well. There are many different kinds of searches possible - BFS, DFS and others. combiner takes a list of successors for the current state the search is looking at, as well as a list of all the other states the search still plans to look at. It combines these into a single list somehow, and thus guides the order in which the search happens.

Even before we see how this function is used, it's already apparent that this is quite a powerful abstraction. tree-search defines the essence of what it means to "search a tree", while being oblivious to what the tree contains, how it's structured and even what order it should be searched in; all of this is supplied by functions passed in as parameters.

Let's see an example, doing a BFS search on our infinite binary tree. First, we define a breadth-first-search function:

(defn breadth-first-search
  "Search old states first until goal is reached."
  [start goal?-fn successors]
  (tree-search (list start) goal?-fn successors prepend))

This function takes a start state (a single state, not a list), goal?-fn and successors, but it sets the combiner parameter to the prepend function, which is defined as follows:

(defn prepend
  "A combiner that places the old states y before the new states x."
  [x y]
  (concat y x))

It defines the search strategy (BFS = first look at the rest of the states and only then at successors of the current state), but still leaves the tree structure and the notion of what a goal is to parameters. Let's see it in action:

paip.core=> (breadth-first-search 1 #(= % 9) binary-tree)
9

Here we pass the anonymous function literal #(= % 9) as the goal?-fn parameter. This function simply checks whether the state passed to it is the number 9. We also pass binary-tree as the successors, since we're going to be searching in our infinite binary tree. BFS works layer by layer, so it has no issue with that and finds the state quickly.

We can turn on verbosity (refer to the full code to see how it works) to see what states parameter tree-search gets called with, observing the progression of the search:

paip.core=> (with-verbose (breadth-first-search 1 #(= % 9) binary-tree))
;; Search: (1)
;; Search: (2 3)
;; Search: (3 4 5)
;; Search: (4 5 6 7)
;; Search: (5 6 7 8 9)
;; Search: (6 7 8 9 10 11)
;; Search: (7 8 9 10 11 12 13)
;; Search: (8 9 10 11 12 13 14 15)
;; Search: (9 10 11 12 13 14 15 16 17)
9

This is the prepend combiner in action; for example, after (3 4 5), the combiner prepends (4 5) to the successors of 3 (the list (6 7)), getting (4 5 6 7) as the set of states to search through. Overall, observing the first element in the states list through the printed lines, it's clear this is classical BFS where the tree is visited in "layers".
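That combiner step can be checked in isolation (prepend is repeated here so the snippet stands alone):

```clojure
(defn prepend [x y] (concat y x))

;; The new successors of 3 are (6 7); the queued states are (4 5).
;; prepend keeps the queued states first - BFS order:
(prepend '(6 7) '(4 5))
;; => (4 5 6 7)
```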

Implementing DFS using tree-search is similarly easy:

(defn depth-first-search
  "Search new states first until goal is reached."
  [start goal?-fn successors]
  (tree-search (list start) goal?-fn successors concat))

The only difference from BFS is the combiner parameter - here we use concat since we want to examine the successors of the first state before we examine the other states on the list. If we run depth-first-search on our infinite binary tree we'll get a stack overflow (unless we're looking for a state that's on the left-most path), so let's create a safer tree first. The following generates a successors function that defines a "finite" binary tree with a given maximal state value:

(defn finite-binary-tree
  "Returns a successor function that generates a binary tree with n nodes."
  [n]
  (fn [x]
    (filter #(<= % n) (binary-tree x))))

Note the clever use of higher-order functions here. finite-binary-tree is not a successors function itself - rather, it's a generator of such functions; given a value n, it creates a new function that acts as successors but limits the states' values to n.

For example, (finite-binary-tree 15) will create a successors function that represents exactly the binary tree on the diagram above; if we ask it about successors of states on the fourth layer, it will say there are none:

paip.core=> (def f15 (finite-binary-tree 15))
#'paip.core/f15
paip.core=> (f15 4)
(8 9)
paip.core=> (f15 8)
()
paip.core=> (f15 7)
(14 15)
paip.core=> (f15 15)
()

As another test, let's try to look for a state that's not in our finite tree. Our infinite tree theoretically has all the states:

paip.core=> (breadth-first-search 1 #(= % 33) binary-tree)
33

But not the finite tree:

paip.core=> (breadth-first-search 1 #(= % 33) (finite-binary-tree 15))
nil

With our finite tree, we are ready to use depth-first-search:

paip.core=> (with-verbose (depth-first-search 1 #(= % 9) (finite-binary-tree 15)))
;; Search: (1)
;; Search: (2 3)
;; Search: (4 5 3)
;; Search: (8 9 5 3)
;; Search: (9 5 3)
9

Note the search order; when (2 3) is explored, 2's successors (4 5) then come before 3 in the next call; this is the definition of DFS.
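The corresponding combiner step, in isolation: the new successors go in front of the remaining states, so the search dives down a branch before coming back.

```clojure
;; The DFS combiner is plain concat: new successors of 2 first,
;; then the remaining state 3.
(concat '(4 5) '(3))
;; => (4 5 3)
```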

We can implement more advanced search strategies using this infrastructure. For example, suppose we have a heuristic that tells us which states to prioritize in order to get to the goal faster (akin to A* search on graphs). We can define a best-first-search that sorts the states according to our heuristic and tries the most promising states first ("best" as in "best looking among the current candidates", not as in "globally best").

First, let's define a couple of helper higher-order functions:

(defn diff
  "Given n, returns a function that computes the distance of its argument from n."
  [n]
  (fn [x] (Math/abs (- x n))))

(defn sorter
  "Returns a combiner function that sorts according to cost-fn."
  [cost-fn]
  (fn [new old]
    (sort-by cost-fn (concat new old))))

diff is a function generator like finite-binary-tree; it takes a target number n and returns a function that computes its parameter x's distance from n.

sorter returns a function that serves as the combiner for our search, based on a cost function. This is done by concatenating the two lists (successors of first state and the rest of the states) first, and then sorting them by the cost function. sorter is a powerful example of modeling with higher-order functions.
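As a quick sanity check (definitions repeated so the snippet is self-contained), here's what the combiner produced by sorter does when merging the successors (6 7) with a remaining state (2), under the cost function (diff 9):

```clojure
(defn diff [n]
  (fn [x] (Math/abs (- x n))))

(defn sorter [cost-fn]
  (fn [new old]
    (sort-by cost-fn (concat new old))))

;; Distances from 9: 7 -> 2, 6 -> 3, 2 -> 7; so 7 is tried first:
((sorter (diff 9)) '(6 7) '(2))
;; => (7 6 2)
```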

With these building blocks in place, we can define best-first-search:

(defn best-first-search
  "Search lowest cost states first until goal is reached."
  [start goal?-fn successors cost-fn]
  (tree-search (list start) goal?-fn successors (sorter cost-fn)))

Once again, this is just like the earlier BFS and DFS - only the strategy (combiner) changes. Let's use it to find 9 again:

paip.core=> (with-verbose (best-first-search 1 #(= % 9) (finite-binary-tree 15) (diff 9)))
;; Search: (1)
;; Search: (3 2)
;; Search: (7 6 2)
;; Search: (6 14 15 2)
;; Search: (12 13 14 15 2)
;; Search: (13 14 15 2)
;; Search: (14 15 2)
;; Search: (15 2)
;; Search: (2)
;; Search: (5 4)
;; Search: (10 11 4)
;; Search: (11 4)
;; Search: (4)
;; Search: (9 8)
9

While it finds the state eventually, we discover that our heuristic is not a great match for this problem, as it sends the search astray. The goal of this post is to demonstrate the power of higher-order functions in building modular code, not to discover an optimal heuristic for searching in binary trees, though :-)
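That said, PAIP does suggest a remedy for exactly this failure mode: a cost function in the spirit of its price-is-right, which makes overshooting the target infinitely expensive so the search never wanders past it. Here's a Clojure sketch (this function is my addition, not part of the code above):

```clojure
(defn price-is-right
  "Returns a cost function for target price: the distance below price,
  where any state above price is considered infinitely bad."
  [price]
  (fn [x]
    (if (> x price)
      Long/MAX_VALUE
      (- price x))))

;; States below the target are ranked by closeness; overshoots sink
;; to the back of the sorted list:
((price-is-right 9) 5)   ;; => 4
((price-is-right 9) 12)  ;; => Long/MAX_VALUE
```

Passing (price-is-right 9) as the cost-fn to best-first-search should keep the search from chasing states larger than the target.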

One last search variant before we're ready to wrap up. As we've seen with the infinite tree, sometimes the search space is too large and we have to compromise on which states to look at and which to ignore. This technique works particularly well when the target is not some single value we must find, but rather we want a "good enough" result in a sea of bad options. We can use a technique called beam search; think of the beam of light a flashlight produces in a very dark room: we can see what the beam points at, but not much else.

Beam search is somewhat similar to our best-first-search, but after combining and sorting the list of states to explore, it only keeps the first N, where N is given by the beam-width parameter:

(defn beam-search
  "Search lowest cost states first until goal is reached, but never consider
  more than beam-width states at a time."
  [start goal?-fn successors cost-fn beam-width]
  (tree-search (list start) goal?-fn successors
               (fn [new old]
                 (let [sorted ((sorter cost-fn) new old)]
                   (take beam-width sorted)))))

Once again, higher-order functions at play: as its combiner, beam-search creates an anonymous function that sorts the list based on cost-fn, and then keeps only the first beam-width states on that list.

Exercise: Try to run it - what beam width do you need to set in order to successfully find 9 using our cost heuristic? How can this be improved?


This post attempts a code-walkthrough approach to demonstrating the power of higher-order functions. I always found this particular example from PAIP very elegant; a particularly powerful insight is the distilled difference between DFS and BFS. While most programmers intuitively understand the difference and could write down the pseudo-code for both search strategies, modeling the problem with higher-order functions lets us really get to the essence of the difference - concat vs. prepend as the combiner step.

See also: this code sample ported to Go.