

Thanks for the notice, I really forgot to mention the simplest algorithm for generating permutations. I've added it in the Discussion part of the recipe. 

My five cents...
If items of array are from very small set, for example chars 'A'..'Z' of digits '0'..'9', there is seem to be more simple to understand algorithm: first group all same items and then recursively populate resulting array from first index to last. Sample code below
static void sub(int[] counters, char[] word, int p)
{
if (p == word.Length)
Console.WriteLine(word);
else
for (int i = 0; i < 26; i++)
if (counters[i] > 0)
{
counters[i];
word[p] = (char)(i + (int)'A');
sub(counters, word, p + 1);
counters[i]++;
}
}
public static void generate(string s)
{
int[] counters = new int[26];
for (int i = 0; i < s.Length; i++)
counters[s[i]  'A']++;
char[] word = new char[s.Length];
sub(counters, word, 0);
}
static void Main(string[] args)
{
generate("HELLO");
}
Point of harvesting our generated resuts is Console.WriteLine(word); it is somewrere deep inside recursion. If it is required to bring it to top just to make code more readable, c# users can use IEnumerable yield technique:
static IEnumerable<string> sub(int[] counters, char[] word, int p)
{
if (p == word.Length)
yield return new string(word);
else
for (int i = 0; i < 26; i++)
if (counters[i] > 0)
{
counters[i];
word[p] = (char)(i + (int)'A');
foreach (string w in sub(counters, word, p + 1))
yield return w;
counters[i]++;
}
}
public static IEnumerable<string> generate(string s)
{
int[] counters = new int[26];
for (int i = 0; i < s.Length; i++)
counters[s[i]  'A']++;
char[] word = new char[s.Length];
foreach(string w in sub(counters, word, 0))
yield return w;
}
static void Main(string[] args)
{
foreach(string w in generate("HELLO"))
Console.WriteLine(w);
}


I don't think this recipe needs this. There are SRM problems which need just what the recipe says  iterating over all permutations. More advanced techniques are not universal, and we'll have recipes about them (namely "Constructing Combinatorial Objects" and "Finding the lexicographically Kth combinatorial object"), but this recipe is hardly a place for them. 

That's right and that's because I mention that this algorithm fits only for small values of n. But, first, the recipe is named as it is named, and going to be the one for beginners; second, I don't know any general method of making cutting off in iterating over combinatorial objects  it's should be individual for every such problem; and third, problems where we need to do this are relatively rare in SRMs. 

I think the discussion should have more emphasis in the best practice of using this kind of tool; Just like it is mentioned and used in the MatrixGame & TheLuckyString problems, we need to generate certain rules/filters to optimize the search. To look for all n! possibilities probably never will be a feasible solution. This is also called Combinatorial Explosion which verbatim refers to a growth almost impossible to sustain for the time of this type of TC problems.
So more important than the implementation ( which is assumed know next_permutation() that this is called Algorithm L in Knuth's TAOCP 4: 7.2.1.2) it is the good use some criteria for pruning combinations (permutations) and at the end we obtain a reasonable running time.
Many problems use this kind of boobytrap to let the programmer to try to solve with a plenty quick solution which runs over all permutations (or combinations) that finish with a timeout or an unnecessary overhead as resources used (such us memory). 

OK, as you insist, I added this estimation to the recipe (what made the post too big, so I splitted it). 

As for the problem example, you can look at TheLuckyString. And this is Java solution where we do exactly what we say in previous paragraph – generate all permutations straightforward and check them for required property (no two adjacent elements are equal):
import java.util.*;
public class TheLuckyString {
public boolean my_next_permutation(int[] a) {
int n=a.length;
int i,j,k,temp;
i=n2;
while (i>=0 && a[i]>=a[i+1]) i;
if (i<0) {
for (j=0,k=n1; j<k; j++,k) {
temp=a[j];
a[j]=a[k];
a[k]=temp;
}
return false;
}
j=n1;
while (a[i]>=a[j]) j;
temp=a[i];
a[i]=a[j];
a[j]=temp;
for (j=i+1,k=n1; j<k; j++,k) {
temp=a[j];
a[j]=a[k];
a[k]=temp;
}
return true;
}
public int count (String s) {
int a[]=new int[s.length()];
int i;
for (i=0; i<s.length(); ++i) a[i]=s.charAt(i);
Arrays.sort(a);
int ans=0;
do {
for (i=0; i+1<a.length; ++i) if (a[i]==a[i+1]) break;
if (i+1==a.length) ans++;
} while (my_next_permutation(a));
return ans;
}
};
And this is more compact C++ solution, in which we will iterate from the last permutation to the first one – just to show that there are no major differences:
#include <vector>
#include <string>
#include <algorithm>
using namespace std;
class TheLuckyString {
public:
int count (string a) {
int i;
int ans=0;
sort(a.begin(),a.end());
reverse(a.begin(),a.end());
do {
for (i=0; i<int(a.size())1; ++i) if (a[i]==a[i+1]) break;
if (i==int(a.size())1) ++ans;
} while (prev_permutation(a.begin(),a.end()));
return ans;
}
};
Another example may be MatrixGame. To solve it, we should search over all permutations of columns: got next permutation of columns, we sort obtained rows lexicographically and compare the whole matrix with current answer. There are two thin details: first, we should not so take a fancy of next_permutation function as using it for generating all permutations of rows also, because it will result in total (8!)^{2} possibilities, what is too much to go into time limit; second, we cannot generate the next permutation of columns by generating next permutation of every row of the matrix simultaneously, since for distinct rows there may be distinct numbers of permutations of elements (because of equal elements in the same row) and even there are two rows with the same numbers of permutations, they may have distinct lexicographically numbers of initial permutations, and will not change equally under effect of next_permutation – so we should create a vector indicating current permutation of columns, and rearrange elements in the matrix after getting every next permutation of that vector’s elements. This is the solution:
#include <string>
#include <vector>
#include <algorithm>
using namespace std;
class MatrixGame {
vector<string> ans;
vector<string> now;
public:
vector<string> getMinimal (vector<string> a) {
int i,j;
vector<int> v;
ans=now=a;
for (i=0; i<a[0].size(); ++i) v.push_back(i);
do {
for (i=0; i<a.size(); ++i)
for (j=0; j<a[i].size(); ++j)
now[i][j]=a[i][v[j]];
sort(now.begin(),now.end());
if (now<ans) ans=now;
} while (next_permutation(v.begin(),v.end()));
return ans;
}
};
And the last example will be TheMoviesLevelTwoDivTwo. Here we should iterate over all possible permutations of movies and punctually simulate John’s scare level during watching them. As every movie has two parameters – length and scary moment – we can join these two ints into pair<int,int> and permutate such pairs, or also create an array of indices. 

I think a short mention that next_permutation takes a constant amount of time on average would be more useful than the current simple analysis, because:
 everyone at this point should be able to do this simple calculation themselves, so it is not interesting (especially that it is naive),
 even if there is no direct practical importance in this case, I think it would be good for general educational purpose to note that some algorithms work much faster than one could get by just counting the nesting of loops,
 actually, what really matters in this case in TopCoder is not the time generating all permutations, but time of doing some stuff (X) for all of them. To do this, you obviously have to do the X thing N! times. So even if N is 8 and the X thing is something very complex, the "try all permutations" algorithm does not work.


Yes, it has been pointed already by it4.kp, but I think it isn't necessary to give the best asymptotical estimate in this case.
I don't know about swap function: even coding this program in Java was rather hard for me. But as there are nobody who claims that there is one, I suppose there isn't :) 

So if there are no equal elements in the array, time complexity of whole algorithm is O(n!*n).
That is not the whole truth. The next permutation algorithm always inspects two last elements, but the third one is inspected once per 2 times, the fourth one is inspected once per 6 times, the fifth one once per 24 times, etc. Thus, the time complexity is actually
O(n!+n!+n!/2+n!/6+n!/24+n!/120+...) = O(e*n!) = O(n!)
There is no swap function in Java, right? (Otherwise my_next_permutation should use it) 

As the instance of Nickolas I've supplemented "Discussion" section with new details and had to split initial post into two parts because of size limit. 

Discussion “Objects supporting comparison” may be not only numbers, but characters, strings, etc. – any objects for those operation of comparison is defined.
And there are some examples of getting next permutation: 3 2 4 3 1 → 3 3 1 2 4 1 1 2 2 → 1 2 1 2 And getting first permutation from the last one: 5 4 3 2 1 → 1 2 3 4 5 Using these tricks, you may not only generate all permutations from the first one to the last one in increasing order. Of course, you may also generate them from the last one to the first one in decreasing order – you should just generate the last permutation by reversing the first one and write function my_prev_permutation, which should be a twin to my_next_permutation with operators >= comparing array elements turned into <=. You can also generate all permutations starting from and stopping at not only first or last, but any arbitrary one. But you should make sure of you really iterate over all permutations. In particular, when iterating from the first one to the last one it means that you should sort your array lexicographically beforehand. And if somebody in your room misses this point, it may be good time for you to challenge her solution quickly.
But an interesting question may be: Why does my_next_permutation get exactly lexicographically next permutation from a given one? Let’s examine it intently. To generate the next permutation we should find such an element a_{i} (with maximal possible i) that can be replaced by such an element a_{j} that a_{i}< a_{j} and i<j> (with minimal possible a_{j} under these constraints), then swap a_{i} and a_{j} and reorder all elements with indices more than i nondecreasing. And our my_next_permutation does exactly this. First of all, we go from the last element to find the first such element that has greater right neighbor – and this is a_{i}. We see that the sequence of elements with indices greater than i is ordered nonincreasing. Then we find the minimal element with index j>i such that a_{j}>a_{i} and swap these two elements. Because old a_{i} was greater than or equal to a_{j+1} (if j<n1>, of course), new sequence of elements with indices greater than i is still ordered nonincreasing, so we should just reverse it to obtain nondecreasing sequence.
Another question may be: What is the time complexity of algorithm? Answer is T(n)=A(n)*B(n), where T(n) is time complexity of whole algorithm of iterating all permutations, A(n) is an estimate of number of permutations, and B(n) is time complexity of getting next permutation. Of course, if we have an array with all elements distinct, A(n)=n!, and if there are k distinct groups of equal elements with size S_{i} of ith group, an answer should be n!/(S_{1}!*…*S_{i}!*…*S_{k}!). And B(n)=O(n), as we can see easily, because there are only three nonnested cycles in function my_next_permutation, and each of them run along the some part of array of length n. So if there are no equal elements in the array, time complexity of whole algorithm is O(n!*n). But, to be honest, that’s not the best asymptotical estimate for this algorithm. If we examine it more deeply, we recognize that two last elements are inspected at every iteration, but the third one – once per 2 times, the fourth one – once per 6 times, and so on. So the time complexity is O(n!+n!+n!/2+n!/6+n!/24+n!/120+...) = O(e*n!) = O(n!).
Such an estimation of time complexity shows that this algorithm can be used only for small n, about 10 or 12. It is useful when you want to gather something information about every permutation and you can’t think of method giving this information without straightforward searching over all permutations. So our algorithm is some kind of brute force search. When there is a greater constraint on n, you should use something cleverer. On the other hand, if n is even lesser you can use even simpler recursive algorithm, in which at every recursion step (which is determined by current position, for which you choose an element to put in) you should iterate over all elements which haven’t been used in the prefix of the permutation and process the whole permutation every time you reach the last position. 

1) For answer your question, crusial words in this paragraph are maximal and minimal. It generates exactly the next permutation (and not arbitrary lexicographically greater than given one) because of this.
2) I mention it :) Just reread attentively. 

Why does my_next_permutation get exactly lexicographically next permutation from a given one? Let’s examine it intently. To generate the next permutation we should find such an element ai (with maximal possible i) that can be replaced by such an element aj that ai< aj and i<j (with minimal possible aj under these constraints), then swap ai and aj and reorder all elements with indices more than i nondecreasing.>
I know that this works, but I am having a hard time understanding why this works. In other words, why does this procedure give us the next lexicographical permutation, rather than some other permutation?
Also you can mention that next_permutation (at least the C implementation) ignores repeated permutations. So if you have repeated elements then 0,0,1,1 will go to 0,1,0,1 (instead of 0,0,1,1 again). This is a very useful feature that allows us to solve problems with long permutations (eg. length 20). For example see this solution, which did not time out on the challenge case. 
