In the past I've written about how simple arithmetic can be employed to compute a powerset of a given set.

Here I want to show a generalization, that uses n-nary arithmetic. But first, let's define the problem:

Suppose you have a set of elements and you want to select multi-subsets from it. By multi-subset in this context I mean that an element can appear more than once in it. For example, given the set {0, 1, 2, 3, 4, 5}, then {1, 1, 2} is a multi-subset. So are {5, 5, 5, 5} and {0, 1, 2, 3, 4, 5}. Suppose you want to go over all multi-subsets of a set. How can this be done?

Note that generating a superset is a private case of this problem, restricting each element to appear either 0 or 1 times in the resulting subset.

So the solution is a generalization of the binary-arithmetic solution for the powerset problem.

Intuitive motivation: consider the decimal numbers, for example 25. If we use the position of each digit (starting with the units) to convey information, this leads to an interesting observation. If we have two elements to choose from, 25 may mean 5 times the 1st element, 2 times the second element. Now, going over all numbers from 0 to 99, we are actually generating all multi-subsets of two elements where each can be picked from 0 to 9 times.

Once this is clear, the algorithm is simple. Let's generalize to a n-ary base system, using position to point to an element and the 'digit' at this position to say how many times it appears in a given multi-subset. And the best part - the simple rules of addition with carry can now be used to efficiently generate all multi-subsets, given the amount of elements we have (length) and the maximal amount of times each can be picked (upto), the minimum being assumed 0.

Here's the code:

def multiselects(upto, length):
  # Arithmetically, we create an array of digits
  # (each in the range 0..upto).
  # It's initialized with '1'
  #
  ar = [1] + [0] * (length - 1)

  while True:
      yield ar

      # The index we're currently trying to
      # advance
      #
      idx = 0

      # Advance the current index. If it reaches
      # the limit (upto), pefrorm a carry to the
      # next index (digits)
      #
      while idx < length:
          ar[idx] += 1
          if ar[idx] <= upto:
              break
          else:
              ar[idx] = 0
              idx += 1

      # We've reached the last number...
      #
      if idx == length:
          break

An an example run of:

for s in multiselects(2, 3):
    print s

Produces:

[1, 0, 0]
[2, 0, 0]
[0, 1, 0]
[1, 1, 0]
[2, 1, 0]
[0, 2, 0]
[1, 2, 0]
[2, 2, 0]
[0, 0, 1]
[1, 0, 1]
[2, 0, 1]
[0, 1, 1]
[1, 1, 1]
[2, 1, 1]
[0, 2, 1]
[1, 2, 1]
[2, 2, 1]
[0, 0, 2]
[1, 0, 2]
[2, 0, 2]
[0, 1, 2]
[1, 1, 2]
[2, 1, 2]
[0, 2, 2]
[1, 2, 2]
[2, 2, 2]

Note that the solution is general, as the lists it returns are lists of indices. These can be employed with any set to generate multi-subsets.

Background and links

I came up with this function while working on Project Euler's problem 77. I ended up using a different method, but visualizing the possible partitions of primes was very useful.

Here are some interesting mathematical links related to this problem: