||How did you approach this problem?
I have to say that this contest was very close to being a total disaster for me, because for the most time (maybe the first 10-11 days) I had no idea what I could do to reach a score above 700K, which many competitors had (let alone the 795K top score).
During this time I tried many geometric approaches (either greedy or backtracking-based), but they all failed (even if they were very good in the beginning, they would invariably create an edge with a too high or too low ratio, which could not be fixed eventually).
I also read many papers about "distance geometry" and embedding graphs into Euclidean spaces with low distortion. Most of the approaches try to optimize for a sum of edge "badness", not for the min/max case. I even tried the MDS (Multidimensional Scaling) technique implemented in Python's sklearn package and it was very bad (min ratio ~ 0.1, max ratio ~ 3.0 on most test cases => very bad score).
Note that all these approaches used real coordinates, not constrained to lie inside the 700x700 grid (because I might construct some rotated/translated solutions).
Then I also thought about trying a kind of force-based approach, with vertices pulling/pushing each other if they were too far/too close. I implemented a crude version of this (each vertex pulls/pushes itself along the straight line between them with equal force, i.e. they are both pushed/pulled to cover the needed distance equally, until they reach the exact required distance), which I iterated as long as time allowed (always computing displacements for all the vertices first and then applying them) and this was the first thing that started scoring a bit more decently (~500K after running it on the output produced by a greedy). But I didn't think this could be pushed much further, so I kept trying to find other, smarter, approaches (e.g. greedy with some look-ahead, splitting the grid into zones in order to get an approximate zone for each vertex first, etc.).
As the contest was getting closer and closer to the end and I wasn't coming up with any new brilliant ideas, I decided to get back to the force-based approach and see what more I could squeeze out of it. It turned out that it had a lot more potential than I initially estimated (so much so that my final solution is running just multiple versions of the following "core" idea:
1) pick 2 intervals [min_ratio1, min_ratio2], [max_ratio1, max_ratio2], a number of iterations NI, and one or more starting vertices (or just an edge) and put them in a queue
2) For iteration = 1 to NI:
2.1) choose min_ratio and max_ratio inside the given intervals (so that min_ratio goes from min_ratio1 to min_ratio2, and max_ratio goes from max_ratio1 to max_ratio2, as the number of iterations increases) - we should have min_ratio2 <= min_ratio1 and max_ratio2 >= max_ratio1, in order to impose fewer and fewer constraints on the system and allow it to potentially cool down
2.2) Pick the node u from the front of the queue
2.3) Update u's coordinates based on the coordinates of its neighbors (if they're further than max_ratio, pull them to max_ratio distance; if they're closer than min_ratio, push them to min_ratio) - the update will modify the coordinates of both u and its neighbors
2.4) Add the nodes with modified coordinates to the queue (if not already there) - optionally, use a priority queue and always pick a node corresponding to an edge with the current minimum or maximum ratio.
- when pulling/pushing, one may choose to push/pull more (with an additional random distance) up to the best targeted min/max ratio (which should be min_ratio1 and max_ratio1)
- the strength of pushing/pulling needs not be identical for the two nodes (e.g. when pushing, one may move some random part of the additional distance, while the other will be moved along the remaining part)
- starting nodes/edges should correspond to edges with the current min/max ratio
- the system doesn't necessarily stabilize at an equilibrium, but along the way passes through many good states; thus, every K iterations check the current score (K can be quite small, because computing the score by iterating over all the edges can be interrupted once rmin / rmax is lower than the best score found so far; or we can compute the score quickly, by maintaining the current min-/max- ratio in a data structure, at the expense of many updates)
As before, I used real coordinates with no bounds for them, so every time I found a new good score, I had to try to map it to the discrete grid. I didn't do a very good job here, unfortunately. I simply tried multiple scaling factors, rounding the coordinates, making sure they're unique (move them around a bit) and then fix them by a simple hill climbing (if at all possible). I didn't try to do any rotations, although they would probably have helped.
My scores on real coordinates (and without coordinate limits) are about 4K larger on average than the scores obtained by mapping them on the discrete grid. When the minimum given distance between 2 points was 1, I forced all the points to stay inside the 700x700 grid (but not at integer coordinates), because in this scaling down would have lead to very bad scores (max ratio cannot be lower than 1 for these tests).
In the end I simply executed multiple versions of the "core" algorithm "in parallel" (i.e. one set of iterations at a time), because I noticed that they were usually finding their best solution quite fast and after that they would get stuck (with further improvements coming along very slow, if at all). So running multiple versions of them helped in finding diverse solutions. For the end I took the best solution and ran yet another version of this "core" algorithm on it (which was better for incremental improvements, but could not reach very good scores if starting from some random configuration).
As far as the starting configuration, in the end it was better to start from fully random coordinates than from the output of a Greedy, so I essentially deleted all the code I wrote in the first ~10 days of the contest.
Provisional rank: 10
Provisional score: 788593.97
3) 0.7518743272898021 (score before mapping to grid: 753730.357)
4) 0.7987411114792682 (score before mapping to grid: 806515.258)
5) 0.7999212093883721 (score before mapping to grid: 803164.277)
8) 0.789571138295211 (score before mapping to grid: 794892.927)
9) 0.7559114170451269 (score before mapping to grid: 758295.691)
||Good for you, I did not find anything that worked.
I found out that for edges not satisfying triangular equality like
0 1 1
1 2 2
2 3 3
3 4 4
4 0 20
the optimum is placing the vertices on the straight line, all smaller edges having the ratio equal to 1
and the value of optimum is then the ratio of the biggest edge (in this case 0.5)
So I computed the shortest distances between vertices and found out the theoretical optimum by taking the smallest ratio of all edges that are longer than shortest distance between its vertices. For seeds 100-599 the average of this theoretical optimum is 822168.84
That is when my jaw dropped as top guys practically have this.
But as I said, nothing I tried worked anywhere near that. I even coded the Multidimensional Scaling (using power iteration for first eigenpair, then Hotelling Deflation to find the second eigenpair also using power iteration) as I was certain that is the solution. Not ;)
I also tried springs simulations, Stochastic proximity embedding (stretching/expanding random edges to bring it closer to ratio 1), Stochastic proximity embedding interval (stretch/expand only if the ratio is not in desired interval which I choose to be theoretical optimum and 1), SA on moving vertices by random angle, and variations on those.
In the end I learned a lot (like Cayley-Menger coordinates) except not how to achieve a good score ;) So I do not take it as a failure.
||I have to say I really enjoyed the problem even though my solution was no where near the best solutions.
1) Create an initial greedy solution where I tried to place points that had the maximum degree and its adjacent vertices optimally.
2) Then use SA by picking one of the four vertices that belonged to the max ratio and min ratio edges, and try to place that vertex randomly on the grid so that it improves the solution and update the four vertices.
The thing that was the major bottleneck in the solution was computing the score after each state change. I tried to improve it by only traversing the vertices adjacent of the candidate vertex and got some improvement from that.
I also implemented SA for the first time it was performing better than hill climbing so I have to say it was a good learning experience. Another thing I noted that while using SA when I changed the candidate vertex to any of the surrounding points on the grid, then solution was much worse than picking random point on the grid.
Provisional Score : 369781.98
Provisional Rank : 66
||I also computed the theoretical optimum like you did (pretty early on during the contest) and was also amazed (at the time) that the top scores were indeed very close to it. On the tests that I used I can say that my scores (of the final solution) are ~45K away from optimum on average (with some tests reaching it almost perfectly and others being more than 100K away).
||Prev score: 776566.38, 17th place.
I tried to find out some sophisticated method, but none of them worked well.
Finilly I selected very simple elements and trust in speed. I have two stage:
2. randomization (with special manner)
1. Climbing: select the worst edge and try to make it better. The rate target was 0.87 to 1.13 first then narrowed when found better solutions. Note that worst edge selection is time consuming, it costs about 40 % of total time. (It could be possible to find a better way but I run out of my time.)
2. Randomization: select a random edge and make a large change. At the begining of randomizations stage the change was even larger then it is necessary to set optimal value. (eg. real length 100, target length 200 then I move both edge with about 95.) I take at least 2500 steps of randomization before starting to climb again.
I do not optimalize for small distances and other special cases. My program is still quite random. Selecting the best result from 5 run could result about 787000 points. (That meas that a speed up of 5 times could take at least 10 k more points.)
I am interested whether top players has any good idea, or just do faster program.
Note that I prefer problems where it is possible to utilize more idea and less speed optimalization.
||For worst edge selection I kept black-and-red tree (set in C++) with the worst edge sitting at the root. The nodes were integers pointing to entries of array holding edge specification. I would pop the worst edge, try to improve it, modify the array in relevant places (there were not that many) and pushed the node back on the tree. Actually it is hard to tell what the "worst" edge is. A perfect graph is not necessarily with each edge ratio equal to 1.0, it could be 0.8, 1.1 or any other number for that matter. So I kept two trees, one with ascending ratio and the other in descending.
||I also struggled with this task. I experimented with a few approaches, but I was used in the end is the following:
* I pick a scale s (between 0.8 and 1.0) and start with threshold x=0.6 that will be increased over the course of the algorithm.
* As long as there is an edge with ratio of current length to desired length smaller than x * s or greater than s / x, I pick a random such edge and a random its endpoint (call it v) and I perform one of the following two steps:
for each neighbour u of v if the edge uv does not obey the limit, I move the point u the smallest possible distance along the vector uv to make the edge happy,
Move 2: I pick two random neighbours u1 and u2 of v, I assume that u1 and u2 will not move and I need to move v. I calculate the crossing of two circles - one centered at u1 with radius being equal to the target distance of (u1,v), the second centered at u2 with radius equal to the target distance of (u2,v). This circle intersection gives two potential locations for v and I the random one.
On top of the above moves I have SA where the objective function is the number of unsatisfied edges. When there all the edges are happy with respect to the current threshold x, then I raise x by 0.06. The whole process is repeated a few times. Unfortunately adding more time to my approach did not help significantly, so it was clear to me that what I did was suboptimal.
||I had rather unusual situation. My time during this match was severely limited and most probably I would not participate if not for the grand-prix system that was introduced this year. This was my experience :)
That being said, I was very lucky with my first submission (~90 minutes, ~750K): I started with a simple SA (random initial placement, move a single vertex randomly as a transition/neighbor function) on the (more or less) final scoring function, which did not give any decent results (around 200K after some fine-tuning). I figured out that it's just very easy to get stuck in local maximum, so it would be best if I could start from original position. In order to find state that should be close to original position, it's enough to do SA on absolute differences between actual distances and given distances. So my solution does this first and then switches to original scoring function.
My third submit (~5h, ~786K) was almost the same, but I just optimized code, tweaked formulas, nothing really big.
After that, I decided that I won't be doing any big changes and I will only stick to introducing very small changes and running big number of tests on EC2 (2K tests on m4.16xlarge took 12 minutes). I tested every stupid thing that came to my mind and with the help of big number of runs (probably around 200 runs of 2K tests), I managed to move to ~792K.
On the last day, I noticed that increasing run time twice increases my score by about +900 (and +1600 for quadrupling it). I didn't try to optimize my code after the first day, so I figured out that this might be a good idea. I did a lot of crazy optimizations, and in the end my code run around 60-70% faster on my EC2 instance, but for some reason it was slower by 20-30% on TC's instance. And I still have no idea why.
The bottom line is, I feel a little bit as a cheater, since essentially I just threw bunch of $$$ at servers in order to have stable results and focused on ridiculously small changes. Especially since I feel that the core of my solution is very far from optimal (for example the first part could be done more effective with the use of algorithm similar to what mugurelionut did).
||That was a great problem!
It seemed to me that will be too "speed dependent" at the beginning, but it gave room for many different ideas.
My final solution is just:
1) Build a random solution
2) Improve it, moving one vertex at a time, using SA.
Although that is what many people implemented I guess, what gave me good score improvements were the following decisions:
a) Which vertex to move:
I keep a list of all edges sorted by their ratio (actual / expected distance).
Only 4 of them move, two from each end of the list, i.e. the worst ones, with the smaller or the larger ratios.
From the chosen edge, I take one of the two edges randomly.
An implementation note: after an accepted move, this list is updated.
That has a huge cost (usually consume half of the running time).
I tried to use a TreeSet (which I believe uses a red-black tree), but in the end, it was slower.
Since my list is ordered, it uses a binary search to find the current and new position of the edge that was updated (edges from the moved vertex).
But after that, it needs to move all items in the list between the initial and final positions.
So, for large cases (with many edges), this is really a problem.
b) Where to move:
Instead of picking a random position nearby the current one, I realized that my solution worked much better, especially in the early stage of SA, taking the "expected" position from 3 (or 2) edges of the vertex.
Explaining this idea a bit (although I think many people use similar approaches)...
My solution keeps a "target" value of the ratio (explained below, in the least item).
So, if I take two edges of the current vertex that I am trying to move, the two other vertices are fixed at this point.
If the current median ratio is M, the desired distances between the two fixed vertices and the moving vertices are M.D1 and M.D2 (where D1 and D2 are the expected distances).
Then with two points and two distances a circle-to-circle intersection calculation is made, which gave usually 2 intersection points (or 1, or none).
In the case of none (vertices are two far from each other, for example), I take a point in the middle which minimizes the error between the desired and actual distances.
If the moving vertex has more than 2 edges (which was usually the case), I repeat the process taking another pair of edges, and then choose a single point that is closer to 2 of the candidate points (in a perfect scenario, all circles would intersect in the same point).
This point that came from the circle-to-circle intersection is averaged with the current point, giving an increasing weight as time goes by to the current point.
And finally a random shift is made (using a gaussing value that is also descreased with time).
c) SA objective and initial temperature:
I used a linear cooling schedule, with two evaluation functions: the main one is the actual score (global, i.e. all vertices, min/max ratio before and after the move.
If the score is the same, there is a second evaluation based on the difference from the median only for the edges of the current moving vertex.
Initial temperature has a huge impact in the score, since it was easy to get suck, but a too high value loses time and prevents it reach a good score on time.
I ran a lot of tests, with different temperatures, and realized that it was highly dependent on the average number of edges per vertex, and used different values (higher for sparse cases).
d) Target ratio between actual and desired distances:
Initially I believed that the best solution would have ratios near to 1.
After removing some code that was "forcing" the solution in this direction, I observed a good score improvement.
On the other hand, for some cases, my solution tended to "shrink" everything, and ended up with points very close to each other (sometimes ratios as low as 0.1).
This was not good because if you are working on a 70x70 grid, a single unit changes the ratio a lot.
I added then a "target" ratio of 0.9.
This is used as an initial ratio (in the first 30% of the time or so).
For finding candidate points (described in item b), the used ratio is in fact averaged between the current median of all edges ratios and this constant value.
That worked well, and produced ratios around this value.
The only cases that I noticed that this fixed value hurted the score were those containing edges with an expected distance of 1, which will require a ratio of 1 or more.
I implemente some code to handle this corner case, but decided not to keep it in my final submission (and probably wasted some points here).
||I think the following approach may speed up search for the worst edge to process.
1. Order all edges
2. Pick a ratio threshold (s) to get you a number of edges you want to work with and place them in a smaller cache
3. Process edges from the cache which hopefully you can order and handle faster
4. Once you displace a vertex recalculate all affected edges, check against the threshold and either add them to the cache or throw away
5. When the cache is exhausted go back to step #1
||In fact I think about keeping just part of the edges in a faster structure, but in practice it didn't help.
During the SA process of my solution, edges in such situation keep changing too frequently, and in the end this kind of cache structure didn't have a high "hit ratio".
And whenever the new position in the list is close to the previous one, very few item (possibly none) need to be moved, which is already fast.
||I was very short on time and I didn't even try to submit, so only tested locally. My initial approach was basically to randomize initial positions then optimize using a gradient descent on a score function similar to the goal.
I also realized I got stuck easily on local minimum so I tried to improve randomization using genetic algorithms to no good. I blamed the algo but the I realized I didn't properly mapped the problem into the GA. Maybe I'll try to fix it later just to see if it works. As I got stuck, I also tried multidimensional scaling approaches. Force-directed, eigenvalues, etc. None worked.
But this is a fun problem. I'll see if I can get a decent local score using some of these ideas here.
I added then a "target" ratio of 0.9.
Reading that line made me feel really really dumb :) I did consider it in the beginning but I simply forgot about that.
Multiplying all of the given distances by 0.9 (single line of code) increases my score roughly by 5K. Which means that either my provisional score is inflated or my "stupid" solution is not that stupid after all :)
||I was also trying to split the solutions into two parts: restoring original point coordinates and improving towards score. Second part was just hill climbing since improving it didn't make much sense as long as first part was performing terrible.
I was trying many kinds of simulations to place points minimizing sum of abs(ratio-1) but later realized that it would never work for sparse graphs. I remember trying SA at some point but probably I had a bug somewhere and it didn't give me anything. Using SA as Psyho mentioned there instead gives me enough boost to place in top 15 after which many things can be improved.
Lesson learned: put more trust in SA instead of reading stupid papers