JOIN
 Select a Forum     Round Tables New Member Discussions News Discussions Algorithm Matches Marathon Matches NASA Tournament Lab TopCoder Cookbook High School Matches Sponsor Discussions Development Forums Design Forums Search | Watch Thread  |  My Post History  |  My Watches  |  User Settings View: Flat (newest first)  | Threaded  | Tree Previous Thread  |  Next Thread Forums TopCoder Cookbook Marathon Competitions - New Recipes 3.2 Setting Up Your Environment/Keeping Track of Your Progress
Re: 3.2 Setting Up Your Environment/Keeping Track of Your Progress (response to post by amiune) | Reply
Testing and Tracking your Solutions.

Now you have your test environment set up and a couple of ideas to attack the problem, it is time to translate those ideas into code and test them. Keep the score of your naïve solution as the base to improve in the next submissions.

Saving Test Results

With your local tester you will have the ability to print the results in a file for later processing and analysis. Let suppose that you have decided that 100 test cases will give you a good statistic to compare different solutions. Then you can automate your local tester to process those 100 tests at once and print the results in a file using comma separated value format, xml or any other format that you like.

A CSV file of the results can look like this:

Test Case, Input Parameter1, Input Parameter2, …, Input ParameterN, Score, Run Time
1,23,”A”,…,245,3048.0,2340
2,45,”S”,…,347,588.3,2010

100,65,”S”,…,856,2488.9,4853

Having all this data for each solution will allow you to make comparisons like in the following example:

 Test Case Input Parameter 1 Input Parameter 2 Score Solution 1 Score Solution 2 1 23 A 3048.0 1045.3 2 45 S 588.3 1039.4 3 35 A 2695.4 987.4 4 87 S 659.4 985.8 5 43 S 458.2 704.9 6 86 S 674.1 830.4 7 24 A 3485.2 2346.3

In this example you can infer that the Solution 1 scores much better than Solution 2 when the Input Parameter 2 is equal to A and Solution 2 scores much better than Solution 1 when the Input Parameter 2 is equal to S, so you can merge the two solutions to improve your score by doing something as:

if(inputParameter2 == “A”) runSolution1();
else runSolution2();

This is just a simple example to show why is important to write your own testing program and how you can take advantage of it.

Keeping track of your solutions and their score is very important because a lot of times you will end writing a solution that scores worst than previous ones so you will want to rollback everything you did.

The simplest way to track your solutions is to make a full submission to TopCoder server every time you think you have improved your solution. This way all your submissions code and their score will be saved in TopCoder servers and then you will be able to see each solution score download any of them. This is the simplest way to track your solutions but it has several drawbacks, you don’t have much detail about what changes did you make to each solution, you only have the final score of your solution and not a detailed score per test case and if the score is relative to the score of the other participants then it will be hard to compare your different solutions.

The best way to track your solutions is to use a revision control tool, also known as version control tool or source control tool. With these tools you can save every little change that you make to the code and add a note of why are you doing that, you can also rollback any change you made. Another useful feature of these tools is that you can fork your base solution to try different approaches and then merge into one.

There are several free tools to do source control, two of the most used tools are:
GIT (http://git-scm.com/)
Subversion and TortoiseSVN (http://tortoisesvn.tigris.org/)

GIT is more popular in Unix like environments and SVN is widely used in Windows environments.

The purpose of this recipe is not to explain these source control tools in depth so the following is just a brief tutorial that can help you to get started fast with GIT to work on your Marathon Match code:

1. Install git
2. Create a git repository where your project folder and files are. To do that just change directory (cd) to your project folder and then run the command
git init
2. Add all your files to the repository by using the command
3. Commit your changes to the repository
git commit -m "Initial commit"
(This command will record your changes to the repository so once you commit all your changes are safe and you can always rollback to this point)
4. Create a branch to try a new idea.
git branch new_approach_name
(A branch is a copy of your code where you can work without modifying the master copy. This is ideal to try new approaches in a Marathon Match)
5. Switch to that branch
git checkout new_approach_name
6. Work on that idea:
- If you improve your score then commit your changes and write a useful message to know what did you do
git commit –a -m "Dijkstra's algorithm used to find the shortest path – New total score: 89.3"
- If your changes don’t improve your score then you can revert them with the following command
git checkout path_of_the_file_to_revert
7. Merge the idea into the main code using the merge command.
git checkout master (switch back to the master branch)
git merge new_approach_name
(This can generate conflicts in the master branch files, if this is the case you can then edit those files manually and then commit. If you don’t want to deal with the merge command then you can always switch from branch to branch using the git checkout command and copy and paste manually the parts of code you want.)

Other useful commands are:
git branch
This command will show you the list of existing branches, the current branch is marked with “*”

git diff
This command will show you the differences between commit and working tree

git log
This command will show you the list of commits that you have done

git revert commit_name
This command revert the changes made by a commit command

To learn more about GIT you can go to http://git-scm.com/documentation

You can always do all this manually but once you learned how to use a revision control tool you will save a lot of time and you will also learn something that is very useful for real projects because these tools are what companies use to develop software.

Discussion

An important matter when you run tests is the number of test cases. The more test cases you test the more accurate the statistics will be but also it will take longer to perform all test cases, this will depend on the problem and the variance of the solution but usually 100 test cases will give you a fair approximation and 1000 test cases will give you a solid statistic.

Some variables that are useful to measure for each solution are: the final score, the mean, the standard deviation, how many test cases scored zero, the minimum and maximum score and the maximum time that the solution takes to return an answer. All these variables will give you a better insight when comparing two solutions.

You will often use algorithms that make use of Random generated numbers, it is very important to choose a fixed seed so you can compare different submission.

Avoid zeros, timing out or giving a wrong answer will score zero, this is an expensive error that is easy to avoid most of the time if you have set up your test environment.

When you make a test submission on the TopCoder server you will receive an email with your scores for each test case but also a link to “Example Results” which has more information like
 Forums TopCoder Cookbook Marathon Competitions - New Recipes 3.2 Setting Up Your Environment/Keeping Track of Your Progress Previous Thread  |  Next Thread