

My approach had 3 parts:
1. count colors in c1 guesses (ideally this would be nearly skipped) 2. find a totally wrong guess 3. keep a set of all possible combinations of a subset of each position color pair. generate guesses isolated to the subset (simulate to find best avg guess) and filter out the ones which are not consistant. if this set gets below a certain size add another position color pair subset. 

What I've managed to intuit is that, more often than not, NP complete problems tend to have solutions which can be completely different if the inputs are modified by a constant number of 'elements' (like a constant size graph attachment, or something). This is the case in Minimal Vertex Covers, the Knapsack Problem, Integer programming, Kcoloring a graph, the traveling salesman problem, probably most of the others as well. I don't think that the converse is true.
Also, it's not true that you need to iterate over the entire, or most of the state space for NP complete problems: there are some pretty interesting new results which drastically reduce the number of states you need to examine in integer programming, not to mention the classes branch and bound. It's pretty cool. 

My approach was very similar to gsais's.
As a first step, I used up to (K1) guesses to find the color counts, but at every step, I used the previous information and random swaps to extract more information (so if the first guess 00..00 shows that there are 2 0s in the code, my second guess was a random permutation of 2 0s and L2 1s, and so on).
In the second stage, I generated a few random permutations of the correct colors, and using single swaps, I tried to get a code, which is as consistent with the previous answers as possible. My consistency metric was the same as gsais's, Sum(Abs(Expected matched  Actual matches)).
This algorithm works the best, if at each step a large number of random permutations are examined, so after TheFaxman told that inline assembly is allowed I rewrote my optimizebyswaps code to use SSE2, giving it about sixsevenfold speed increase. So my final code examines 3.7e10 / (L^4 * log2(K)^2) combinations at each call to nextGuess(), that ensures that runtime will remain around 1214 seconds even in worst case (according to local test, that means a score of 1.6).


First, I decided that it was entirely likely that a random guess consistent with the guesses before it would divide the remaining state space by O(a constant factor), which would locate the solution in O(length * lg (k)) time (since the state space is of size k^l). I knew that this line of reasoning was in the right direction when I noticed that the scoring formula was O(length * lg(k)).
Therefore, any method of generating consistent guesses fast should score well.
Initially I thought about genetic algorithms, and then decided that diversity could be substituted with a more randomized guess. I guess I kind of moved along the path of venco's solution.
I thought that the trade off of looking through k  1 colors was worth it, because it drastically reduced the state space when you knew which colors you had. This was only a problem because I couldn't figure out how to glean color information analytically from disorderly guesses! I wouldn't need it otherwise.
Unfortunately I started first with an entirely randomized guess. This was terrible!
Next I implemented a 'voting' system where a result with X black pegs would get (k  1) * X votes for [each color, position] pair in that guess, and X votes for each [color != the color in the guess, position] pair.
This was a step in the right direction, but unfortunately missed conditional probability. I couldn't figure out a good way to calculate the actual conditional probability of each color appearing in each position based on the passed guesses. If I had, or found a better way of estimating, I probably would have been much closer to venco's solution. Looking back I could have limited the order of the conditional probability generation and probably would have done well.
Each guess was allocated randomly with a probability density proportional to the votes[color,position]. Then each guess was scored against the previous guesses, and if it was within 3 pegs total (summed across all other guesses) the guess would hill climb toward a consistent guess using all 2swaps and 3swaps available to it.
 This Scored Horribly 
I struggled with making some improvements, but I knew the problem was with the voting system.
At this point I saw that many people, particularly with lower ratings, had made scores ~50. They must be using algorithms almost guaranteed to converge, so I thought. I need to reduce the state space.
So I did.
Algorithm, Take 2:
First, I noticed that while cycling through the colors to count them, you can put one color that is known to exist in any slot, and determine if it's meant to go there or if the cycling color is meant to go there based on the number of white pegs you read.
I use this concept to linearly search through the color with the most unidentified positions, nailing a few slots down in the process.
If a color has no pegs, we use this color as a 'background' color. Otherwise, we continue the linear search, using the two most ubiquitous remaining colors as the sweeping color and the color that filled the rest of the space.
Next, I binary search through each color against the 'background' color, starting with the most numerous. The binary searching improved the score from ~50 > 73, but still left something to be desired.
I had three remaining ideas:
1. Change from binary searching to something close to theoretically optimal for finding x of the same colors in an interval of length l, when l is small enough. 2. When the number of unknown positions remaining is low, solve the entire problem with something close the theoretically optimal. 3. If the state space for the whole solution is small, solve closetotheoretically optimally, skipping the color solving.
The closetotheoretically optimal method I decided on (for 1) involved creating all length l bitfields consistent with previous guesses, and guessing at each step the bitfield which had the highest average number (calculated combinatorially) reductions of the remaining number of NumColorInRange, where NumColorInRange is the number of pegs of the current color that is known to be within the interval. I guessed that this should be O(length) in guess length (as above), which was confirmed by my experiments.
For 2, I would have done essentially the same thing, except instead of bitfields I'd have strings.
1. appeared to improve the solution by a little bit. I then needed to figure out how large I could make the largest closetotheoretically optimal search. I found that I could make length <= 13, but this ran out of time occasionally, so I also stipulated that if the wallTime > 10 seconds instead length <= 12.
This was my last submission, scoring 77.8. I wasn't able to finish 2, or 3, and I would have been able to increase the length significantly if I have optimized my method further. I bet this approach could reach scores into the 90's if done properly, but I definately ran out of time.
I'm glad that my initial idea was on the right track, but I'm disappointed that I couldn't figure out how to construct information from disorderly guesses very well. venco, do you think you can describe a bit more how you obtained the information analytically? Thanks in advance, and congratulations! 

Several coders have mentioned random swaps as the way to find compatible guess. My code does a bit more. It repeatedly removes 28 worst pegs, then adds them back in the best places. In each step if there are several possibilities with the same change in error, I pick one randomly. Similar algorithm was used in the first stage when I don't know color counts yet for finding compatible set of counts. 

I have a totally different approach which works pretty well. It goes like this:
Find the right combination of colors in k1 guesses using (0,0,0,0...) (1,1,1,1,...) etc.
Create some arbitrary guess using these combinations.
For each step, swap only 2 spots that have different numbered pegs. There are 5 cases possible for the resulting score change:
Let us say we previously guessed X in position A and Y in position B (X,A) and (Y,B), and the swap creates a new guess with (X,B) and (Y,A) with everything else remaining constant.
Score +2: (X,B) and (Y,A) are correct configurations.
Score 2: (X,A) and (Y,B) are correct configurations.
Score +1: The two pegs are contradicting in their new configuation  ie. if (X,B) is correct, (Y,A) is incorrect and vice versa. We note this by creating "contradiction lists" for each configuration in which we can look up the list of all the conflicting configurations that are connected to any configuration.
Score 1: The two pegs are contradicting in their old configuation  ie. if (X,A) is correct, (Y,B) is incorrect and vice versa. These are also added to the conflicting configuration list.
Score unchanged: (X,A) and (Y,B) are correlated, and (X,B) and (Y,A) are corellated  ie. if (XA) is correct then (Y,B) is correct and if (XA) is wrong then (Y,B) is wrong. The same is true with (X,B) and (Y,A). This is not entirely intuitive, but it is true. We create "correlation lists" for each configuration in the same way as above.
Now, when you do a swap, and you get a result of +2, then you can call those configurations correct. In addition, you can go through the list of contradictions and correlations for those configurations and mark them thusly. And for each of those, you can mark the configurations in their lists, and so on. We keep a set of "correct" and "wrong" configurations, not swapping any correct configurations and not swapping into two wrong configurations. For the +1 and unchanged score cases, we can check against our list of known wrong configurations and imply results in many cases.
Now, my submission had a tinybutimportant bug in it, as I did not prevent swapping into two wrong configurations (not a very helpful swap!), so my score sucks. It was 24.3. But with that error fixed, I went from never getting the k=20, l=100 case to getting it in an average of 1400 or so guesses with randomly seeded tests. Also, speed is not even relatively an issue as running of this algorithm with the largest cases takes about 1 second or so. I figure this is a decent overall result but I have no clue what my score would have been. The whole scoring thing for these matches is still a bit obtuse to me... 

Phase 1:
Determine the colors present in the code.
If all the possible colors were in the code, start with Phase 2a. Otherwise, jump to Phase 2b.
Phase 2a:
This phase is a variant of a simple brute force search where you change each peg one at a time until results[0] increases by one, then you move to the next peg. Instead of doing things 1 peg at a time, I did them 2 at a time. A little trickier than 1 peg at a time since there were some special cases that had to be handled, but still manageable.
After you've found the color for a peg, update the number of pegs left in the code of that color. Once you find a color that doesn't occur in the rest of the code, move to Phase 2b.
Phase 2b:
Similar to what other people did with guessing codes that are consistent with what has been guessed so far. Rather than guessing the whole code, I guess blocks at a time, starting at the leftmost part of the code that hasn't been determined yet. Once a "block" has been determined, move on to the next block. When randomly guessing a code, I would randomly pick results[0] pegs to keep from the last "best" guess, and then guess the remaining pegs using the probability distribution of the colors of the remaining pegs. (The last "best" guess is the one which I guessed that had the largest value of results[0].) 

I was new to Topcoder and such kind of problems, hence I probably did not make much use of the 20 seconds time alloted.
I tried to find a regular efficient simple direct solution, and and this resulted in generation of guesses with hardly any computation in almost no time.
Here's the thing I did:
Stage 1) Use k1 unicolor guesses to find the number of pegs of each color. I tried a lot to think of other ways instead of doing this way, but I couldnt find a way to make them simple enough to be coded!
Stage 2) Finding Background color: If there is any color with no pegs, skip this step, and and use that as the background color. Otherwise, do a linear search on the rarest color to find where it's pegs are located, and then it can serve as the background color for the remaining spots.
Stage 3) Now that we have a background color, we could replace any pegs with this color, and the reduction in the number of exact matches would correspond directly to the correct pegs we have displaced. So using this, I did a binary search for each color's pegs  basically replacing the pegs I was interested in by the background color, and thus finding how many pegs of a color are located there, and then changing the region of interest in a binary search fashion. Repeating this for each color, I was able to find the actual code.
Using this approach I scored around 65. Then I made an improvement in stage 2, by using a modified binary search to find the locations of the rarest color thereby reducing a possible whole pass of k guessses.
This got me to score ~75.
I think this is a decent method  if the time taken to generate the guesses is also taken into account, this would compare fairly well I feel.
However, I felt this approach was not doing good l is approximately same or lower than k. 

It looks like skipping stage 1 is what allowed me to take the lead with large margin. I've added this stage to my final submission, and the score dropped to around 113. 

Yes, I guess when K >= L, "wasting" K1 guesses for color frequencies is not a good idea. 

Let len=l (size of array) 1) Perform (k1) guesses to compute freq of colors. 2) Set the whole array elements to the most frequent color. 3) Perform (len) operations, each one is only changing an elem to the next most frequent color. This will determine exactly all elems that belongs to one of the two most frequent colors. 4) During step 3, get the number of elems with most frequent color in the first half, and using this info, perform (k2) guesses to determine the frequency of each color in each half. 5) Iterate over all elements in the first half, changing two elems at one time to the most frequent color in their candidate list. By The most frequent color, I mean the color that has the largest frequency in the unsolved positions of the first half. 6) Make the same for second half. 

I did not get the method of guessing based on previous guesses. Can you please explain what is: Sum(Abs(Expected matched  Actual matches)), and how the new guess is selected? 

This is provably true! With K1 guess it is almost always possible to split the code in two equally sized parts and know the color distribution in both parts. That's how my solution basically works.
It continues to split each part in two until each part has less than about 1Million permutions and bute forces these. Additionally there are a few special cases for small lenghts. This gave me ~110 points so it can't be too bad. 

I did not get the method of guessing based on previous guesses. Can you please explain what is: Sum(Abs(Expected matched  Actual matches)), and how the new guess is selected? The method is very simple (in theory): always return a guess that is completely consistent with all the previous feedbacks. It goes like this:
a) A code is somehow choosed and postulated to be the secret key b) The postulated code is used as the key to score each of the previous guesses and the resulting scores are compared against the codemaker's feedbacks for those previous guesses c) If any of the comparisions fails (ie. we calculated a score of [3,5] but the codemaker's feedback was [4,5] for that guess) then it's said that the postulated code is "inconsistent" with the feedbacks (thus it can't possibly be the secret key)  another code must be choosed (repeat from step a) d) Otherwise, if all the comparisions are successful, then it's said that the postulated code is "consistent" with the feedbacks (thus it can possibly be the secret key)  the postulated code is returned as the next guess
There are two important observations to make about this method:
1) All of the returned guesses have a greater than zero probability of being the secret key 2) The total number of consistent codes always decreases with each guess made  therefore the probability of finding the secret key always increases with each successive guess
It's because of these two features that the method is tipically called "stepwise optimal strategy" and it has been already found to be nearoptimal on average (of course, optimal in this sense means relative to other algorithms  not to the ultimate optimal of always finding the answer in the first guess using shamanic powers). Strategies that are not stepwise optimal include "human" approaches (ie. slowly changing the guess to arrive at the answer by reasoning), and all the approaches that use a "background" color (ie. a color which is known to not occur in the answer or else that it's known where it's located within the answer).
As good and simple as the stepwise strategy is, it has a very important complication: altough finding a consistent code for small problems is rather easy, for larger problems is extremely hard. In fact, it has been already proved that just deciding if there exists a consistent code (a subset of the problem of finding it) is a NPcomplete problem; see lbackstrom's post above. Even for not so large problems (~ K=10 L=15) you can't actually hope to find a consistent code by simply generating random codes, you really need to use some heuristic to guide the search. However, even with good heuristics, for slightly larger problems it's almost impossible to find a consistent code (unless you already have enough hints), and for the very large problems the only thing that becomes easy is understanding the true meaning of "finding the needle in a haystack". This is where the "almost" stepwise strategies come, and they come in basically two flavors:
* Divideandconquer: use a background color or an analytical method to divide the problem in smaller problems and apply a stepwise approach for the smallest problems; see the approaches of lyc1977 (background color; discarding the impossible combinations is a way to find the consistent codes) and OvejaCarnivora.
* Least inconsistent code: instead of finding a completely consistent code with all the previous feedbacks, find the code that is the least inconsistent (within a subset of the search space). Essentially, a cost formula is used that measures how much inconsistent a code is, and any optimizing algorithm is used to minimize the cost. venco used the following cost formula (note: hits = feedback[0] = pegs with correct color and correct position; hints = feedback[1] = pegs with correct color but incorrect position):
cost = sum { abs(hits  hits') + 10000 * abs((hits + hints)  (hits' + hints')) } The 10000 factor is used to give more weight in the first moves to the total number of correct colors (hits + hints) than to the number of correct positioned colors (later, once the color frequencies are discovered the term becomes automatically zero). Now, since both vdave and I started the search already knowing how many pegs of each color were in the answer, the number of hints was completely irrelevant (because hits + hints = L if the code is a permutation of the right colors), thus the cost formula reduces to:
cost = sum { abs(hits  hits') } Furthermore, knowing the frequency of each color greatly limited the search space and also simplified the operation of finding the closest neighbours of a code (ie. codes with an approximate cost) to swaps. But these advantages didn't come without a price: as venco himself has already stated, his 14+ points advantage over vdave (in the tests) are very likely due to his algorithm skipping the color's frequencies discovery phase  yet he doesn't say that achieving that was far away from being an easy task at all (I can bet I wasn't the only one that unsuccessfully attempted that, so I'm looking forward to read exactly how he did it in the match editorial  right now I'm clueless edit: ok, now I understand how he accomplished it: it's thanks to the brilliant 10000 * diff(hits + hints) term in the cost function). On the other hand, vdave's "smart" discovery phase is rather intuitive (after one is told about it, of course...), easy to understand and implement, and it's also likely what allowed him to take the second spot in the tests.
It's interesting to notice that the first and second places used a stepwise strategy consistently during the whole game (ie. a wrong move was never knowingly played); what it's more interesting to notice is that the third place (flagitious) and probably the fourth place too (cae) used the almost exact opposite strategy: playing knowingly wrong positions throughout most of the game! I believe that each of these two strategies have strengths and weaknesses in different areas so none can be considered better "overall" than the other. In particular, I believe that flagitious's approach has a bigger potential to be much more scalable than any stepwise approach, but I don't want to speculate too much before the final results are published and we can see the results...
Edit  Fixed venco's cost formula 
