JOIN
 Select a Forum     Round Tables New Member Discussions News Discussions Algorithm Matches Marathon Matches NASA Tournament Lab TopCoder Cookbook High School Matches Sponsor Discussions Development Forums Design Forums Search | Watch Thread  |  My Post History  |  My Watches  |  User Settings View: Flat (oldest first)  | Threaded  | Tree Previous Thread  |  Next Thread Forums TopCoder Cookbook Marathon Competitions - New Recipes 3.2 Setting Up Your Environment/Keeping Track of Your Progress
Re: 3.2 Setting Up Your Environment/Keeping Track of Your Progress (response to post by amiune) | Reply

Now you have your test environment set up and a couple of ideas to attack the problem, it is time to translate those ideas into code and test them. Keep the score of your naïve solution as the base to improve in the next submissions.

Saving Test Results

With your local tester you will have the ability to print the results in a file for later processing and analysis. Let suppose that you have decided that 100 test cases will give you a good statistic to compare different solutions. Then you can automate your local tester to process those 100 tests at once and print the results in a file using comma separated value format, xml or any other format that you like.

A CSV file of the results can look like this:

Test Case, Input Parameter1, Input Parameter2, …, Input ParameterN, Score, Run Time
1,23,”A”,…,245,3048.0,2340
2,45,”S”,…,347,588.3,2010

100,65,”S”,…,856,2488.9,4853

Having all this data for each solution will allow you to make comparisons like in the following example:

 Test Case Input Parameter 1 Input Parameter 2 Score Solution 1 Score Solution 2 1 23 A 3048.0 1045.3 2 45 S 588.3 1039.4 3 35 A 2695.4 987.4 4 87 S 659.4 985.8 5 43 S 458.2 704.9 6 86 S 674.1 830.4 7 24 A 3485.2 2346.3

In this example you can infer that the Solution 1 scores much better than Solution 2 when the Input Parameter 2 is equal to A and Solution 2 scores much better than Solution 1 when the Input Parameter 2 is equal to S, so you can merge the two solutions to improve your score by doing something as:

if(inputParameter2 == “A”) runSolution1();
else runSolution2();

This is just a simple example to show why is important to write your own testing program and how you can take advantage of it.

Keeping track of your solutions and their score is very important because a lot of times you will end writing a solution that scores worst than previous ones so you will want to rollback everything you did.

The simplest way to track your solutions is to make a full submission to TopCoder server every time you think you have improved your solution. This way all your submissions code and their score will be saved in TopCoder servers and then you will be able to see each solution score download any of them. This is the simplest way to track your solutions but it has several drawbacks, you don’t have much detail about what changes did you make to each solution, you only have the final score of your solution and not a detailed score per test case and if the score is relative to the score of the other participants then it will be hard to compare your different solutions.

The best way to track your solutions is to use a revision control tool, also known as version control tool or source control tool. With these tools you can save every little change that you make to the code and add a note of why are you doing that, you can also rollback any change you made. Another useful feature of these tools is that you can fork your base solution to try different approaches and then merge into one.

There are several free tools to do source control, two of the most used tools are:
GIT (http://git-scm.com/)
Subversion and TortoiseSVN (http://tortoisesvn.tigris.org/)

GIT is more popular in Unix like environments and SVN is widely used in Windows environments.

The purpose of this recipe is not to explain these source control tools in depth so the following is just a brief tutorial that can help you to get started fast with GIT to work on your Marathon Match code:

1. Install git
2. Create a git repository where your project folder and files are. To do that just change directory (cd) to your project folder and then run the command
git init
2. Add all your files to the repository by using the command
3. Commit your changes to the repository
git commit -m "Initial commit"
(This command will record your changes to the repository so once you commit all your changes are safe and you can always rollback to this point)
4. Create a branch to try a new idea.
git branch new_approach_name
(A branch is a copy of your code where you can work without modifying the master copy. This is ideal to try new approaches in a Marathon Match)
5. Switch to that branch
git checkout new_approach_name
6. Work on that idea:
- If you improve your score then commit your changes and write a useful message to know what did you do
git commit –a -m "Dijkstra's algorithm used to find the shortest path – New total score: 89.3"
- If your changes don’t improve your score then you can revert them with the following command
git checkout path_of_the_file_to_revert
7. Merge the idea into the main code using the merge command.
git checkout master (switch back to the master branch)
git merge new_approach_name
(This can generate conflicts in the master branch files, if this is the case you can then edit those files manually and then commit. If you don’t want to deal with the merge command then you can always switch from branch to branch using the git checkout command and copy and paste manually the parts of code you want.)

Other useful commands are:
git branch
This command will show you the list of existing branches, the current branch is marked with “*”

git diff
This command will show you the differences between commit and working tree

git log
This command will show you the list of commits that you have done

git revert commit_name
This command revert the changes made by a commit command

You can always do all this manually but once you learned how to use a revision control tool you will save a lot of time and you will also learn something that is very useful for real projects because these tools are what companies use to develop software.

Discussion

An important matter when you run tests is the number of test cases. The more test cases you test the more accurate the statistics will be but also it will take longer to perform all test cases, this will depend on the problem and the variance of the solution but usually 100 test cases will give you a fair approximation and 1000 test cases will give you a solid statistic.

Some variables that are useful to measure for each solution are: the final score, the mean, the standard deviation, how many test cases scored zero, the minimum and maximum score and the maximum time that the solution takes to return an answer. All these variables will give you a better insight when comparing two solutions.

You will often use algorithms that make use of Random generated numbers, it is very important to choose a fixed seed so you can compare different submission.

Avoid zeros, timing out or giving a wrong answer will score zero, this is an expensive error that is easy to avoid most of the time if you have set up your test environment.

When you make a test submission on the TopCoder server you will receive an email with your scores for each test case but also a link to “Example Results” which has more information like
 3.2 Setting Up Your Environment/Keeping Track of Your Progress | Reply ProblemMarathon Matches, as the name suggests, require a lot of time and effort, it is not enough to just submit a solution, you need to constantly make changes to the solution during the contest in order to improve it. After you spend a good amount of time improving your solution you will realize that a lot of that time and effort is spent on organizing the versions of code, deciding which solution performs better, recalling the changes done between two submissions and reverting them. All this can be avoided by organizing your solving environment properly.SolutionSetting Up your Test Environment.Choose Your Preferred Integrated Development Environment (IDE): Be sure to choose a good IDE because Marathon Matches usually requires writing a lot of code and a good IDE with a good debugger will save you a lot of time. I would recommend Eclipse for C++, Java and Python and Visual Studio Express for C++, C# and VB. Both are free and great IDEs but this is just a recommendation, you should choose the one you feel comfortable with.Write a Naïve Solution: Once you have understood the problem statement the first thing you need to do is to write a naïve solution. By naïve solution I mean the simplest solution you can write, it doesn’t matter if the score of this solution is low, it is only important that it is a valid solution (gives a score greater than zero) and that it has as few lines of code as possible. An example will be just to return zero or an empty string, it will depends on the problem. For a concrete example take a look at the problem PolimerPaking, here the most simple solution will be to return an array of integers with length zero (return new int[0];), this is not a great solution but it is a valid solution that gives you a score greater than zero and will allow you to set up the environment.Test Your Solution: After you wrote your naïve solution you will need to set up your test environment and test that solution. There are two ways to test your solution against TopCoder specifications, you can submit your solution to the TopCoder server or you can Download the Visualizer Tool and test it locally, this is much faster and flexible because you can run the visualizer in your computer as many times as you want.Implement Your Own Tester: When you have your naïve solution and you tested that it gets a score greater than zero in the visualizer or in the TopCoder server then it will be a very smart move to implement the visualizer code directly in your solution, this will allow you to run more tests faster without using standard input and output to exchange information with the visualizer, it will also allow you to calculate the mean and variance of your scores and do other statistical analysis. It is very important when you rewrite the code of the visualizer to be sure that it returns the same score as the visualizer and the TopCoder server.