JOIN
Get Time
forums   
Search | Watch Thread  |  My Post History  |  My Watches  |  User Settings
View: Flat  | Threaded  | Tree
Previous Thread  |  Next Thread
Iterating Over All Subsets of a Set | Reply
Problem:
Iterate over all the subsets of a set.

Solution:
A subset of a set is a set that contains all or some of the elements of that set. For example if the set is {3, 5, 7} then some of its subsets are {} - the empty set, {3, 7}, {5, 7}, {5} and {3, 5, 7} - the original set. Often we need to find a subset with a particular property. If the original set is small, ie. has a small number of elements then it is sufficient to generate and test all its subsets. If the original set has n elements then it will contain 2^n subsets. Assuming that the computation required for each subset is small, we can usually do this for n up to 20.

Let us represent a subset with a unique binary number i. This way there will be a total of 2^n such unique numbers. The k-th element of the set can be either included or not included in the subset. We can associate the inclusion of the k-th element with a value of 1 at the k-th bit of i, similarly if the k-th bit is 0 then the k-th element is not included. For example, here is how we can associate a unique binary number from 0 to 7 with every subset of {3, 5, 7}:

Binary Number Subset
000 {}
001 {7}
010 {5}
011 {5, 7}
100 {3}
101 {3, 7}
110 {3, 5}
111 {3, 5, 7}

Now all that remains is to iterate over all the subsets. Since these are just binary numbers, it suffices to iterate from 0 to 2^n-1 inclusive. In the following Java example we iterate over all the subsets of the set {"appple", "banana", "carrot"}. After generating a subset we can perform any action. Here for simplicity we print it. Notice that the 0-th subset is the empty subset, as it contains no elements.

String[] set={"apple", "banana", "carrot"};
int n=set.length;
 
// iterate over all the subsets
for (int i=0; i < (1<<n); i++)
{
    // generate the i-th subset
    Vector subset=new Vector();
    for (int k=0; k < n; k++)
    {
        if ((i&(1<<k)) > 0)
            subset.add(set[k]);
    }
            
    // perform an action over the subset, here just print it
    System.out.print("Subset "+i+":");
    for (int k=0; k<subset.size(); k++)
        System.out.print(" "+subset.get(k));
    System.out.println();
}


Discussion:
The above procedure is very useful as it can be applied to many problems. One simple extension is to only generate subsets with no more than m elements. To achieve this we can modify the i-loop so that it only iterates over i that do not have more than m bits set to 1:

// iterate over all the subsets with no more than m elements
for (int i = 0; i < (1<<n); i=Integer.bitCount(i) < m ? i+1 : (i|(i-1))+1)
{
    ...
}


Let us decipher i=Integer.bitCount(i) < m ? i+1 : (i|(i-1))+1. We know that i+1 can only add at most one extra bit to i. So if the number of bits in i is less than m, then i+1 will not have more than m bits. Otherwise if the number of bits in i is m, we set i to (i|(i-1))+1. This value is i plus its lowest bit, ie it is the smallest integer greater than i whose bit count is no more than i's bit count. Since i starts with 0, we can see that this approach can never lead to i that has more than m bits. Warning: this does not work for m=0.

Another trick is to iterate over all the subsets of a subset in O(3^n) time:

// iterate over all the subsets
for (int i=0; i < (1<<n); i++)
{
    // iterate over all the subsets of the i-th subset
    for(int i2 = i; i2 > 0; i2 = (i2-1) & i)
    {
        // generate the subset induced by i2
        ...
    }
}

Warning: i=0 (empty subset) is a special case and must be treated separately.

Finally, there are many other operations that can be performed on sets: union, intersection, subtraction, element addition, element deletion. All of these can be done with simple and extremely fast bitwise operations. If A and B are two sets whose bit-masks are a and b respectively then we have the following:

Union of A and B: a|b
Intersection of A and B: a&b
A "minus" B: a&(~b). This is a set of elements in A that are not in B.
Add k-th element to set A: a |= 1 << k
Delete k-th element of set A: a &= ~(1 << k)

For an excellent description of bitwise operations take a look at bmerry’s recipe.

Now let's solve some problems that use these techniques. Let's solve BouncingBalls. For this problem, the number of balls (n<=12) is sufficiently small that we can generate all possible starting states. Each ball can be rolled either left or right, so we have 2^n possible starting states. Let dir[i] be the initial direction given to the i-th ball, 1 for right and -1 for left. A collision will occur between balls k and m if ball k is to the left of ball m (x[k]<x[m]), ball k is rolled right (dir[k]==1), ball m is rolled left (dir[m]==-1) and the distance that each has to roll is not more than the given time T (x[m]-x[k]><=2*T). For each possible starting state, we count the number of possible bounces for every pair of balls k and m. To compute the expected number of bounces we divide this value by the total number of starting states (1<<n).
>
public class BouncingBalls
{
	public double expectedBounces(int[] x, int T)
	{
		int n=x.length;
		int bounces=0;
 
		//iterate over all possible starting states
		for (int i=0; i<(1<<n); i++)
		{
			//construct the current starting state of directions
			int[] dir=new int[n];
			for (int k=0; k<n; k++)
			{
				if ((i&(1<<k)) > 0)
					dir[k]=+1;		//roll ball k right
				else
					dir[k]=-1;		//roll ball k left
			}
			
			for (int k=0; k<n; k++)			//left ball
				for (int m=0; m<n; m++)		//right ball
					if (x[k]<x[m] && dir[k]==1 && dir[m]==-1 && x[m]-x[k] <= 2*T)
						bounces++;
		}
		
		return bounces*1.0/(1<<n);
	}
}

Note that this problem can be solved more easily without iterating over all the subsets. It can be shown that it is enough to add 0.25 for each pair of balls that are close enough. See Egor's solution that implements this idea.

EDIT: Added Eryx's suggestion.
Re: Iterating Over All Subsets of a Set (response to post by dimkadimon) | Reply
This is quite similar to my original cookbook article. I've added some more examples of subsets and how to map them to binary numbers. I think these help in understanding. This article is missing some example problems.
Re: Iterating Over All Subsets of a Set (response to post by dimkadimon) | Reply
Two last paragraphs can actually be replaced with a link to recipe http://forums.topcoder.com/?module=Thread&threadID=671561&start=0 . And yes, this needs two examples of problems solved using this. Otherwise looks good.
Re: Iterating Over All Subsets of a Set (response to post by dimkadimon) | Reply
Now let's solve VectorMatching. In this problem we are given n points and we are asked to find a vector matching that has the minimal length. The number of points is small (n=20) which allows us to iterate over all possible vector matchings. A vector matching consists of two sets of points A and B, such that the head of the i-th vector is the i-th point in A and its tail is the i-th point in B. A and B should have the same number of elements, namely n/2. Each point can be either in set A or in set B, so we have 2^n possible states. However, we only need to consider those states where the size of A is n/2. This reduces the problem to Choose(20,10) = 185756 states.

Suppose the point (x1,y1) is in A and the point (x2,y2) is in B. The vector represented by these two points is (x2-x1,y2-y1). Hence if a point is in A then it contributes negatively to the vector sum; otherwise it is in B and it contributes positively. So now we can determine the vector sum (which is a vector) as we iterate through the points in A and B; its x-component will be sumX and its y-component will be sumY:
public class VectorMatching
{
  public double minimumLength(int[] x, int[] y)
  {
    double result = Double.MAX_VALUE;
    int n = x.length;
    for (int i = 0; i < (1 << n); i++)
    {
      //size of A and B must be n/2
      if (2*Integer.bitCount(i) != n) continue;		
 
      long sumX = 0;
      long sumY = 0;
      for (int k = 0; k < n; k++)
      {
	//point (x[k],y[k]) is in B
	if ((i&(1<<k)) > 0)
	{
          sumX += x[k];
          sumY += y[k];
        }
	//point (x[k],y[k]) is in A
	else
	{
          sumX -= x[k];
          sumY -= y[k];
        }
      }
 
      double length=Math.sqrt(sumX*sumX + sumY*sumY);  //vector length
      result = Math.min(result,length);  //record the minimum length
    }
    return result;
  }
}
Re: Iterating Over All Subsets of a Set (response to post by dimkadimon) | Reply
the cookbook example is good and its helping me. Can u site some problem of this type as i am learning how to program.

Thank in advance
Re: Iterating Over All Subsets of a Set (response to post by dimkadimon) | Reply
I think it should be mentioned that BouncingBalls can in fact be solved more easily without iterating over all subsets. (By changing the order of summation by summing over "i" inside the "k" and "m" loops, it is easy to show that it is enough to add 0.25 for each pair of balls which are close enough.)
Re: Iterating Over All Subsets of a Set (response to post by dimkadimon) | Reply
// iterate over all the subsets with no more than m elements
for (int i = 0; i < (1<<n); i=Integer.bitCount(i) >< m ? i+1 : (i|(i-1))+1)
{
...
}

why i can't modify it as below?

for (int i = 0; i < (1<<n) && Integer.bitCount(i)><=m; i++)
{
...
}
Re: Iterating Over All Subsets of a Set (response to post by savon_cn) | Reply
First of all your code tests all 2^n elements, which makes it much slower. Secondly your code will terminate as soon as it reaches a number with more than m bits set to 1. This means that it will not find all subsets with no more than m elements, only some of them.
Re: Iterating Over All Subsets of a Set (response to post by dimkadimon) | Reply
Another useful bit twiddling hack is gospers hack, which calculates the next greater number, that has the same number of bits set in binary representation. This can be used in a loop, to run through those subsets with k elements.
Re: Iterating Over All Subsets of a Set (response to post by holomorph) | Reply
How can we extend the idea of to generate subsets with k elements. I can simply iterate over all 1<<n states and counting the number of set bits in each of the subset. If it equals k then I'll take into consideration. Is there any better way of doing this ? Can't we loop over only those sets which have k set bits directly instead of iterating over all 1><<n sets ?
Re: Iterating Over All Subsets of a Set (response to post by d@rk_sh@dow) | Reply
You can either use the method described above in Discussion (for m or less bits). Or you can use a more direct method described in NextBitPermutation here: http://graphics.stanford.edu/~seander/bithacks.html#NextBitPermutation
Re: Iterating Over All Subsets of a Set (response to post by dimkadimon) | Reply
You said "Another trick is to iterate over all the subsets of a subset in O(3^n) time", can you explain a bit more how O(3^n) was obtained? Thanks.
Re: Iterating Over All Subsets of a Set (response to post by hacker007) | Reply
There are exactly 3^n pairs (A,B) where B is a set of elements of {1, ..., n} and A is a subset of B. To see this, notice that each pair of sets like this can be associated to a n-digit ternary number (with leading zeros, possibly): Put a 0 in the i-th position if i is in neither A nor B, an 1 if i is in B but not A, and a 2 if i is in both sets. Since this is clearly a bijection, we've proved the result.

If you take O(1) time per subset while iterating, you then have a O(3^n) algorithm for iterating over all pairs as above.
Re: Iterating Over All Subsets of a Set (response to post by dimkadimon) | Reply
How will the method to go over all subsets of a set change if the numbers are not unique. For example if we need to find subsets of {1,1,2,3,4,4,4} such that no set is repeated i.e. We don't need to have {1} twice. But {1,1} will be a valid set.

What is the best way to do it?
Re: Iterating Over All Subsets of a Set (response to post by puneetm) | Reply
I don't know how to do that. I guess you could keep a HashSet of all subsets that you have seen (you need a good hash function). If you meet a subset that you have already seen then skip it.
Re: Iterating Over All Subsets of a Set (response to post by puneetm) | Reply
You can change to another representation of the set: (element,count). The set is then represented as {(1,2),(2,1),(3,1),(4,3)}. For each element, loop through each possible count. In this case, there are 3*2*2*4 subsets to loop through (including the empty set).
Re: Iterating Over All Subsets of a Set (response to post by dimkadimon) | Reply
can you please explain what this line is doing " if ((i&(1<<k)) > 0) " ?
Re: Iterating Over All Subsets of a Set (response to post by sjsupersumit) | Reply
It's checking if bit k of i is set, since 1<<k == 2^k.>
Re: Iterating Over All Subsets of a Set (response to post by dimkadimon) | Reply
the mapping from bit to subset is limited.
when the set size is more than 32 or 64
the subset number will be more than Integer.MIN_VALUE or Long.MAX_VALUE.
Re: Iterating Over All Subsets of a Set (response to post by ZuBruce) | Reply
You can always use a BigInteger. Anyway iterating over such large sets is going to be too slow.
RSS