JOIN
Get Time
forums   
Search | Watch Thread  |  My Post History  |  My Watches  |  User Settings
View: Flat (newest first)  | Threaded  | Tree
Previous Thread  |  Next Thread
Finding the Lexicographically K-th Combinatorial Object | Reply
Name
Finding the Lexicographically K-th Combinatorial Object

Problem
From all combinatorial objects of length n generated by specified rules you need to find the lexicographically k-th one.

Solution
Let k be 0-based. Then let’s suppose am,p to be a number of combinatorial objects of length m satisfying certain condition marked as p. A totality of all p’s for some m describes a condition which suffixes of length m of any our combinatorial objects of length n must satisfy (and every such suffix must satisfy exactly one of those p’s).

So the general solution is like this pseudo-code:
class CombinatorialObject {
	void getMthSymbol (int m, int p, int k, vector<symbol> &ans) {
		if (m==0) return;
		ans.push_back(symbol(m,p,k));
		getMthSymbol(m-1,newP(m,p,k),newK(m,p,k),ans);
	}
};

Where function symbol(m,p,k) returns an appropriate symbol, function newP(m,p,k) returns new condition for a suffix of length m-1, and function newK(m,p,k) returns a number of appropriate suffix under this condition. These two functions may rely on am,p. To obtain the answer, we should call getMthSymbol with arguments n, condition p that actually means no condition in addition to problem description, k, and an empty array ans.

But there are three important questions: how we should calculate a(m,p), newP(m,p,k), and newK(m,p,k). If only there was a universal method, we would describe it here, but modes of calculating these vary too much for different problems, so we explain them on examples.

Discussion
If k is small enough, you may use brute force algorithm generating combinatorial objects in lexicographical order and simply take the k-th one. Our sophisticated approach is for cases when there are too many objects to iterate throw for passing time limit.

I couldn’t find any SRM problems requiring this algorithm easier than Division One – Level Three, so I suppose it will be more understandable if I give some other examples.

In simple cases you can calculate am,p directly and find k using a formula. Let’s suppose we need to find k-th lexicographically array of n integers from 0 to n-1, inclusive. This is basically the task of writing down number k in n-based notation, and can be solved easier, but we'll handle it in a manner that illustrates our idea.
In this simple case there are no constraints p on suffixes depending on previously set symbols, and am,p (actually simply am) is just a number of arrays of length m where each element is an integer between 0 and n-1, inclusive. am=mn, so we precompute it in array pow[m] before calling recursion.
#include <vector>
using namespace std;
class Arrays {
private:
	vector<int> pow;
	void getMthSymbol(int m, int k, vector<int> &ans) {
		if (m==0) return;
		ans.push_back(k/pow[m-1]);
		getMthSymbol(m-1,k%pow[m-1],ans);
	}
public:
	vector<int> getKth(int n, int k) {
		pow.resize(n);
		int i;
		pow[0]=1;
		for (i=1; i<n; ++i) pow[i]=pow[i-1]*n;
		vector<int> ans(0);
		getMthSymbol(n,k,ans);
		return ans;
	}
};
Re: Finding the Lexicographically K-th Combinatorial Object (response to post by Ferlon) | Reply
Discussion (sequel)
Another similar task is finding k-th lexicographically permutation of n distinct integers from 0 to n-1, inclusive. Here am,p is a number of permutations of m distinct elements, what is also doesn’t depend on previously set elements and thus can be marked as am (and am=m!). Even more, function newK also doesn’t depend on previously set elements – when we choose next element, we should just let through all permutations of length m starting with elements going lexicographically earlier than chosen one – and if this chosen one is number t (0-based) from all now possible ones (with taking into account previously set elements), the number of all permutations let through will be am-1*t, so we can set newK(m,k)=k%am-1. But what is actually depend on previously set elements, it is function symbol(m,p,k). Here p means the set of already used elements. As we understand that number of all permutations of m elements starting with definite one is am-1, we can calculate our required t=k/am-1, and then find t-th unused element. In code below we don’t perform operations of division and taking the remainder, but decrease the value of k iteratively instead; and array b[] stores marks for used elements.
#include <vector>
#include <cstring>
using namespace std;
class Permutations {
	static const int c=15;
	int a[c];
	bool b[c];
	int n;
	void getMthSymbol(int m, int k, vector<int> &ans) {
		int i;
		if (m==0) return;
		for (i=0; i<n; ++i) if (!b[i]) {
			if (k>=a[m-1]) k-=a[m-1]; else {
				b[i]=1;
				ans.push_back(i);
				getMthSymbol(m-1,k,ans);
				break;
			}
		}
	}
public:
	vector<int> getKth (int nn, int k) {
		int i;
		n=nn;
		a[0]=1;
		for (i=1; i<=n; ++i) a[i]=a[i-1]*i;
		memset(b,0,sizeof(b));
		vector<int> ans;
		getMthSymbol(n,k,ans);
		return ans;
	}
};

For more complicated example, let’s look at problem in which we need to find k-th lexicographically regular bracket sequence of length n. A regular bracket sequence is a sequence consisting of equal numbers of left and right brackets, and in every its suffix the number of left brackets is less than or equal to the number of right brackets. There am,p will be a number of suffixes of length m such that the difference between number of right brackets and number of left brackets in each of them is equal to p (let’s call it balance). So each suffix with zero balance is a regular bracket sequence itself.
First of all, we calculate am,p using dynamic programming. There is only one (empty) suffix of length 0, and its balance equals to 0 too. Then, when we add one bracket to the left thus increasing length of the suffix on 1, we may come to zero balance only by adding left bracket to the suffix with balance 1 (a[i][0]=a[i-1][1]), but we may come to nonzero balance by two possibilities – adding left or right bracket, decrease or increase balance (a[i][j]=a[i-1][j-1]+a[i-1][j+1]).
Then we start to reconstruct k-th lexicographically regular bracket sequence. When we need to find k-th suffix of certain length with zero balance, we simply add a left bracket to answer and go to finding suffix with balance 1, but when we need to find k-th suffix of length m with nonzero balance p, we should look at k to make decision which kind of bracket to add. If the number of suffixes of length m-1 with balance p+1 is greater than k, a suffix which must be added to already constructed left part of sequence is among them, but if this number is less than or equal to k, we need a suffix with balance p-1. So we add to answer an appropriate bracket and go further.
#include <vector>
#include <string>
using namespace std;
class BracketsSequences {
private:
	vector<vector<int> > a;
	string ans;
	void getMthSymbol(int m, int p, int k, string &ans) {
		if (m==0) return;	
		if (p==0) {
			ans+='(';
			getMthSymbol(m-1,p+1,k,ans);
			return;
		}
		if (k<a[m-1][p+1]) {
			ans+='(';
			getMthSymbol(m-1,p+1,k,ans);
		} else {
			ans+=')';
			getMthSymbol(m-1,p-1,k-a[m-1][p+1],ans);
		}
	}
 
public:
	string getKth(int n, int k) {
		int i,j;
		a.resize(n+1);
		for (i=0; i<=n; ++i) a[i].resize(n+2,0);
		a[0][0]=1;
		for (i=1; i<=n; ++i) {
			a[i][0]=a[i-1][1];
			for (j=1; j<=i; ++j) a[i][j]=a[i-1][j-1]+a[i-1][j+1];
		}
		ans="";
		getMthSymbol(n,0,k,ans);
		return ans;
	}
};

You may use similar method for combinatorial objects that are not of length exactly n, but are of length at most n. For this you should use “empty” symbol with am,p=1 for p associated with it.

As you see, there may be many very different problems solved with a help of this approach. Time complexities of algorithms using it generally depend on time complexity of dynamic schemes of calculating am,p, but in some cases time complexity of searching an appropriate next symbol may be significant too. For example, if at some m number of suffixes am-1,p depend directly on symbol chosen at this step, you may iterate over all possible symbols to find an appropriate (actually in our second example the situation is so, but there are only two different symbols), and if the number of possible symbols is great, and you, say, need to reply to many requests of getting combinatorial object by number, a naïve linear search may be too slow. So it is useful to prepare a cumulative array bm,p=am,0+…+am,p (if we can enumerate all possible p’s 0-based) and find an appropriate symbol and thus an appropriate p using binary search.

Perhaps last to add: Beware of trying to construct combinatorial objects from the end to the beginning! As we are speaking about lexicographic order, we should fix elements going from left to right, not vice versa. Of course, there are problems in which both approaches work correctly, but it is not right in general.
RSS