Generating multi-subsets using arithmetic
July 11th, 2009 at 7:27 amIn the past I’ve written about how simple arithmetic can be employed to compute a powerset of a given set.
Here I want to show a generalization, that uses n-nary arithmetic. But first, let’s define the problem:
Suppose you have a set of elements and you want to select multi-subsets from it. By multi-subset in this context I mean that an element can appear more than once in it. For example, given the set {0, 1, 2, 3, 4, 5}, then {1, 1, 2} is a multi-subset. So are {5, 5, 5, 5} and {0, 1, 2, 3, 4, 5}. Suppose you want to go over all multi-subsets of a set. How can this be done?
Note that generating a superset is a private case of this problem, restricting each element to appear either 0 or 1 times in the resulting subset.
So the solution is a generalization of the binary-arithmetic solution for the powerset problem.
Intuitive motivation: consider the decimal numbers, for example 25. If we use the position of each digit (starting with the units) to convey information, this leads to an interesting observation. If we have two elements to choose from, 25 may mean 5 times the 1st element, 2 times the second element. Now, going over all numbers from 0 to 99, we are actually generating all multi-subsets of two elements where each can be picked from 0 to 9 times.
Once this is clear, the algorithm is simple. Let’s generalize to a n-ary base system, using position to point to an element and the ‘digit’ at this position to say how many times it appears in a given multi-subset. And the best part – the simple rules of addition with carry can now be used to efficiently generate all multi-subsets, given the amount of elements we have (length) and the maximal amount of times each can be picked (upto), the minimum being assumed 0.
Here’s the code:
def multiselects(upto, length):
# Arithmetically, we create an array of digits
# (each in the range 0..upto).
# It's initialized with '1'
#
ar = [1] + [0] * (length - 1)
while True:
yield ar
# The index we're currently trying to
# advance
#
idx = 0
# Advance the current index. If it reaches
# the limit (upto), pefrorm a carry to the
# next index (digits)
#
while idx < length:
ar[idx] += 1
if ar[idx] <= upto:
break
else:
ar[idx] = 0
idx += 1
# We've reached the last number...
#
if idx == length:
break
An an example run of:
for s in multiselects(2, 3):
print s
Produces:
[1, 0, 0]
[2, 0, 0]
[0, 1, 0]
[1, 1, 0]
[2, 1, 0]
[0, 2, 0]
[1, 2, 0]
[2, 2, 0]
[0, 0, 1]
[1, 0, 1]
[2, 0, 1]
[0, 1, 1]
[1, 1, 1]
[2, 1, 1]
[0, 2, 1]
[1, 2, 1]
[2, 2, 1]
[0, 0, 2]
[1, 0, 2]
[2, 0, 2]
[0, 1, 2]
[1, 1, 2]
[2, 1, 2]
[0, 2, 2]
[1, 2, 2]
[2, 2, 2]
Note that the solution is general, as the lists it returns are lists of indices. These can be employed with any set to generate multi-subsets.
Background and links
I came up with this function while working on Project Euler’s problem 77. I ended up using a different method, but visualizing the possible partitions of primes was very useful.
Here are some interesting mathematical links related to this problem:
Related posts:

July 11th, 2009 at 19:26
Your call to multiselects(2, 3) didn’t generate [0, 0, 0]. Back to the drawing board!
July 11th, 2009 at 19:38
@Larry,
This is by design, because an empty subset isn’t interesting. If you need it, just change the initalization of
arin the function to:ar = [0] + [0] * (length – 1)
July 12th, 2009 at 14:33
For Euler problem 77 (and many other ‘counting’ problems) Dynamic Programming is very useful.
http://en.wikipedia.org/wiki/Dynamic_programming
Its worth looking into if you are not already familiar with it.
July 12th, 2009 at 19:06
@Steve,
Yep, project Euler would be hard without DP, which was what I used for problem 77 (with a similar algorithm to 76 and 31)
July 13th, 2009 at 15:36
You can also use itertools.product for this, at least in Python 2.6 and later:
I’m not sure multi-subset is a good name, since it suggests that order shouldn’t matter: that is, that {1, 1, 2} should be considered the same as {1, 2, 1}.