I'm reading Programming Pearls by Jon Bentley now. It's a really nice book, full of interesting insights and challenges in programming [1].

One curious challenge that caught my attention is the following (in chapter 2):

Rotate a one-dimensional vector of n elements left by i positions. For instance, with n=8 and i=3, the vector abcdefgh is rotated to defghabc. Simple code uses an n-element intermediate vector to do the job in n steps. Can you rotate the vector in time proportional to n using only a few dozen extra bytes of storage?

The "catch" here is the constant space requirement. Even if the vector is one million elements long and you have to rotate it left by 50,000 - you can only use a very small constant space. Also, since the time must be O(n), you can't simply rotate the vector one element at a time (since the complexity of this method is O(i*n)).

The author presents three algorithms to solve the problem. Two are clever, and the third is brilliant in its simplicity.

Juggling

We'll use a temporary location t and move all the vector elements left in hops of i, as follows: move x[0] to t, then move x[i] to x[0], x[2i] to x[i], and so on (the indices into x are taken modulo n). Eventually we'll come back to x[0], at which point we instead take the element from t and end the process. If that process didn't move all the elements, then we start over at x[1] and repeat, until we move all the elements.

Now, are you asking yourself just when the process will come back to x[0], and how many elements will have been moved by that stage? So did I, and the answer turns out to be an exciting application of the greatest common divisor (GCD) function.

In the first iteration, the "juggling pointer" jumps i elements at a time and stops when it gets back to 0. How many steps does this take? Ignoring the modulo for a moment: the pointer only visits indices that are multiples of i, and it returns to 0 exactly when such an index is also a multiple of n. So the first return happens at the first index that is a multiple of both i and n.

This first common multiple is better known as the LCM (least common multiple), and it's easy to compute from the GCD: lcm(n, i) = n * i / gcd(n, i).

The number of steps per iteration is lcm(n, i) / i, which by the identity above equals n / gcd(n, i). Therefore, each iteration rotates n / gcd(n, i) elements into place. The next iteration picks up at x[1] and keeps hopping in steps of i from there, moving another n / gcd(n, i) elements. In special cases, such as when n and i are coprime, the first iteration covers all the elements in the vector, and no second one is needed.

In any case, there are gcd(n, i) iterations of n / gcd(n, i) steps each, so the whole process always makes n steps in total, moving all the elements to their correct positions.
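To make the arithmetic concrete, here is a small sketch of my own (the helper name cycle_counts and the sample values are mine; it relies on the gcd function from the listing below):

def cycle_counts(n, i):
    """ (number of juggling iterations, elements moved per iteration),
        following the lcm/gcd reasoning above.
    """
    g = gcd(n, i)            # gcd() is defined in the next listing
    return g, n // g

print(cycle_counts(8, 3))    # -> (1, 8): 3 and 8 are coprime, one pass moves everything
print(cycle_counts(12, 8))   # -> (4, 3): four passes, each moving 3 elements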

Here's an implementation of this algorithm in Python:

def gcd(a, b):
    """ Greatest common divisor of a and b
        Using Euclid's algorithm
    """
    while b:
        a, b = b, a % b
    return a


def rotate_juggle(lst, dist):
    """ An iterative 'juggle' method
    """
    n = len(lst)

    for i in xrange(gcd(dist, n)):      # one pass per cycle; gcd(dist, n) cycles in total
        t = lst[i]                      # save the first element of this cycle
        j = i
        while 1:
            k = (j + dist) % n
            if k == i: break            # the cycle has wrapped around to its start
            lst[j] = lst[k]
            j = k
        lst[j] = t                      # drop the saved element into the last vacated slot

Note that the list is rotated in place, to avoid extra space allocation. This is true of all the algorithms presented in this article.
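For example, using the vector from the problem statement (the call below is just an illustration of mine, not code from the book):

lst = list('abcdefgh')
rotate_juggle(lst, 3)
print(''.join(lst))    # prints: defghabc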

Swapping

Rotating the vector x really amounts to swapping its two segments: writing x as ab, where a represents the first i elements, the rotation transforms ab into ba. Suppose a is shorter than b. Divide b into b_l and b_r, so that b_r is the same length as a. Swap a and b_r to transform a.b_l.b_r into b_r.b_l.a. The sequence a is now in its final place, so we can focus on swapping the two parts of b. Since this new problem has the same form as the original, we can solve it recursively.

Here's a Python transcription of the pseudo code given in the book:

def sublist_swap(lst, a, b, m):
    """ Swaps (in-place) the elements:
        lst[a:a+m) with lst[b:b+m)
        Without using extra space.

        Assumes that all the indices point inside the list.
    """
    for i in xrange(m):
        lst[a + i], lst[b + i] = lst[b + i], lst[a + i]


def rotate_swap(lst, dist):
    """ A 'recursive' sub-list swapping method.
    """
    n = len(lst)

    if dist == 0 or dist == n:
        return
    i = p = dist    # lst[p-i:p] holds the i elements that belong at the end of the unfinished region
    j = n - p       # lst[p:p+j] holds the j elements that belong at its beginning

    while i != j:
        if i > j:
            # the right block is shorter: swap it into its final place at the left edge
            sublist_swap(lst, p - i, p, j)
            i -= j
        else:
            # the left block is shorter: swap it into its final place at the right edge
            sublist_swap(lst, p - i, p + j - i, i)
            j -= i

    sublist_swap(lst, p - i, p, i)    # the two remaining blocks have equal length
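To make the recursive structure described above explicit, here is a sketch of my own (the function name rotate_swap_recursive and the lo/hi parameters are hypothetical; it reuses sublist_swap and follows the same segment-swapping idea, just expressed recursively rather than iteratively):

def rotate_swap_recursive(lst, dist, lo=0, hi=None):
    """ Rotates lst[lo:hi] left by dist, in place, by recursively
        swapping equal-length segments. A sketch, not the book's code.
    """
    if hi is None:
        hi = len(lst)
    n = hi - lo
    if dist == 0 or dist == n:
        return
    i, j = dist, n - dist              # lengths of segments a and b
    if i == j:
        sublist_swap(lst, lo, lo + i, i)
    elif i < j:
        # swap a with b_r (the last i elements); a lands in its final place
        sublist_swap(lst, lo, hi - i, i)
        # the remaining n - i elements still need a rotation left by i
        rotate_swap_recursive(lst, i, lo, hi - i)
    else:
        # swap b with the first j elements of a; b lands in its final place
        sublist_swap(lst, lo, lo + i, j)
        # the remaining i elements still need a rotation left by i - j
        rotate_swap_recursive(lst, i - j, lo + j, hi)

Called as rotate_swap_recursive(lst, dist), it should produce the same result as rotate_swap above.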

Reversing

Here comes the brilliantly simple solution. Recall the example in the problem description: we have to rotate the vector abcdefgh left by 3. Let's view it as two sub-vectors, 3 and 5 items long: abc-defgh.

To rotate the vector, we reverse the first sub-vector to obtain cba-defgh, then reverse the second sub-vector to obtain cba-hgfed. Finally, we reverse the whole thing to obtain defghabc. That's it!
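In slice terms, the identity behind the trick can be checked directly (a throwaway sanity check of mine, not code from the book):

a, b = list('abc'), list('defgh')
assert (a[::-1] + b[::-1])[::-1] == b + a    # reverse each part, reverse the whole: a left rotation by len(a)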

Bentley provides this algorithm as an example of an aha! solution to a problem, and I agree. I definitely felt the light bulb turn on when I read its description.

Here's an implementation:

def sublist_reverse(lst, a, b):
    """ Reverses (in-place) the elements lst[a:b]
    """
    while b > a:
        lst[a], lst[b] = lst[b], lst[a]
        b -= 1
        a += 1


def rotate_reverse(lst, dist):
    """ Uses reversing to rotate the list.
    """
    n = len(lst)
    sublist_reverse(lst, 0, dist - 1)
    sublist_reverse(lst, dist, n - 1)
    sublist_reverse(lst, 0, n - 1)
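A quick check of my own that all three in-place functions agree with straightforward slicing (the same idea as the naive function in footnote [2]); the distances chosen here are arbitrary:

for d in (0, 1, 3, 8):
    for rot in (rotate_juggle, rotate_swap, rotate_reverse):
        lst = list('abcdefgh')
        rot(lst, d)
        assert lst == list('abcdefgh'[d:] + 'abcdefgh'[:d]), (rot.__name__, d)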

Comparing the algorithms' performance

I benchmarked the three algorithms, together with a naive Pythonic solution that uses O(n) extra space [2], on a list of 1,000,000 elements with various rotation distances.

The naive algorithm proved to be the fastest. This makes sense: its slice operations run in optimized C inside the interpreter, while the in-place algorithms execute their explicit loops as interpreted Python code.
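For reference, here is a minimal sketch of how such a comparison could be run (the benchmark helper and its default parameters are my own invention, and exact timings will of course vary by machine and Python version; rotate_naive is the slice-based function from footnote [2]):

import time

def benchmark(rotate_func, n=1000000, dist=50000):
    """ Times a single in-place rotation of an n-element list by dist. """
    lst = list(range(n))
    start = time.time()
    rotate_func(lst, dist)
    elapsed = time.time() - start
    assert lst[0] == dist      # sanity check: the element originally at index dist is now first
    return elapsed

for func in (rotate_naive, rotate_juggle, rotate_swap, rotate_reverse):
    print('%-15s %.3f sec' % (func.__name__, benchmark(func)))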

Conclusion

Sometimes it's very educational to look at a single problem from different angles. And aha! solutions like the reversal algorithm presented here are a pleasure to understand (and especially invent!).

The full source code (with benchmarking and some unit tests) can be downloaded here.

[1] Some paragraphs in this article are direct quotes from the book. I hope this will not be considered plagiarism. All rights to the presentation of these algorithms belong to Jon Bentley and the book's publisher.
[2]
def rotate_naive(lst, dist):
    """ A 'naive' (space inefficient) rotation function.
        The slice operations create new lists.
    """
    lst[:] = lst[dist:len(lst)] + lst[0:dist]