I had the same thoughts. But what are the other options? I mean, when you state these conditions:
- you don't want to release the test data to the contestants,
- you don't want to invite all the contestants to the final testing round,
you can hardly come up with a better mechanism.
One way to improve this would be to allow contestants to select, for the final tests, not just their last submission but any submission they made. You would still be forced to overfit (to reach the top 10 in the provisional ranking), but at least for the final tests you could select a better solution (while still one prepared before the deadline).
Another option would be to split the provisional data, so that the provisional ranking is based on one part of the data and the invitation to the final tests on the other. But that would imply three different rankings: Provisional 1 (with overfitting), Provisional 2 (without overfitting) and Final (based on unseen data). The Marathon platform currently does not support this, so part of it would have to be done manually. Another drawback is that splitting is a bad idea when the provisional data set is not very big.
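To make the split idea concrete, here is a minimal sketch (all names hypothetical, not any platform's actual scoring code). It splits the provisional test cases into two halves, ranks on each half separately, and shows how a solution overfitted to the public half could top the visible leaderboard while still missing the invitation cut:

```python
def split_rankings(scores):
    """Rank contestants two ways from one provisional data set.

    scores: dict mapping contestant name -> list of per-test-case scores
            (higher is better); all lists have the same length.
    Returns (public_ranking, invitation_ranking): the public leaderboard
    is computed on even-numbered test cases, invitations on odd-numbered
    ones, so overfitting one half does not help on the other.
    """
    n_cases = len(next(iter(scores.values())))
    cases = list(range(n_cases))
    half_public, half_invite = cases[::2], cases[1::2]

    def rank(case_ids):
        # Average score over the chosen subset, best first.
        avg = {c: sum(s[i] for i in case_ids) / len(case_ids)
               for c, s in scores.items()}
        return sorted(avg, key=avg.get, reverse=True)

    return rank(half_public), rank(half_invite)

# Contestant B has overfitted the even (public) test cases;
# contestant A is uniformly strong across all cases.
scores = {
    "A": [80] * 10,
    "B": [95, 60] * 5,
}
public, invite = split_rankings(scores)
# B tops the public leaderboard, but A wins the invitation ranking.
```

In practice the split would be random rather than even/odd, but the principle is the same; the real drawback remains that each ranking is then based on only half as many test cases.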