Post your approach | Reply
My basic algorithm has three stages:

1) Find a point in the polygon
2) Expand it to a triangle
3) Pick a likely side and see if there is more polygon beyond it

I mostly choose lines and do binary searches for the polygon boundary along those lines. For each vertex I create, I keep the in point (the closest one to the boundary that is inside the polygon) and the out point (the closest one outside the polygon). Everything is based around the fact that with a convex polygon, you can't have a line go from inside the polygon to outside and then back in. I use this to find the largest triangle that could have a given side of my current polygon as its base. I also use it to move my in and out points closer to the polygon boundary without using more queries (well, sometimes I query the midpoint if it seems the best thing to do).
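A minimal sketch of that kind of boundary refinement (not the code linked below; isInside here is just a stand-in oracle, and the eps tolerance is an arbitrary choice):

#include <cmath>
#include <functional>

struct Pt { double x, y; };

// Shrink the segment [in, out] (in: known inside, out: known outside) until the
// endpoints are within eps of each other. Convexity guarantees the boundary
// crosses this segment exactly once, so each midpoint query halves the interval.
void refineBoundary(Pt& in, Pt& out, double eps, int& queriesLeft,
                    const std::function<bool(Pt)>& isInside) {
    while (queriesLeft > 0 && std::hypot(out.x - in.x, out.y - in.y) > eps) {
        Pt mid = { (in.x + out.x) / 2, (in.y + out.y) / 2 };
        --queriesLeft;
        if (isInside(mid)) in = mid;   // midpoint inside: move the in point outward
        else               out = mid;  // midpoint outside: move the out point inward
    }
}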

My code is available here: PolygonArea_jdmetz.cpp, and it is pretty well commented.

It gets 4.9842 on average over 100000 cases.
Re: Post your approach (response to post by jdmetz) | Reply
My strategy is based on a few theorems stemming from the fact that we are looking at a convex polygon (proofs are left to the reader):
Let points inside the polygon be good points, all other points are bad points
T1) Any point on the line segment AB, where A and B are good points, is itself good (we shall call this a good segment)
T2) The convex hull of a set of good points must also be good (we shall call this a good polygon)
T3) There must be exactly one 'border' between a good point and a bad point
T4) (The major theorem) Given a good segment AB and a bad point C, draw two infinite rays starting from C in the direction of vectors AC and BC ... every point in the area between the two rays you just drew is also bad (including the rays themselves) <-- this allows rapid elimination of areas (a test based on this is sketched after the list)
T5) (Another big one) Given a good segment AB on a good polygon, draw one infinite ray from any point on AB in a direction headed outward. There must be a border point somewhere on the ray (possibly on AB itself), after which all points are bad <--- this allows binary search
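A small sketch of the T4 elimination test (my own illustration of the theorem, not the poster's code). The observation is that if P lies in the cone at C spanned by the directions C-A and C-B, then C is a convex combination of A, B and P, so P being good would force C to be good by T2:

#include <cmath>

struct Pt { double x, y; };

// A, B are good, C is bad. Returns true if P is provably bad by T4, i.e.
// P - C = s*(C - A) + t*(C - B) with s, t >= 0 (P lies between the two rays).
bool provablyBad(Pt A, Pt B, Pt C, Pt P) {
    double ux = C.x - A.x, uy = C.y - A.y;    // direction of the first ray
    double vx = C.x - B.x, vy = C.y - B.y;    // direction of the second ray
    double wx = P.x - C.x, wy = P.y - C.y;    // candidate point relative to C
    double det = ux * vy - uy * vx;
    if (std::fabs(det) < 1e-12) return false; // degenerate: can't conclude anything
    double s = (wx * vy - wy * vx) / det;     // Cramer's rule for w = s*u + t*v
    double t = (ux * wy - uy * wx) / det;
    return s >= 0 && t >= 0;                  // inside the closed cone => bad
}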

How I used these could have been better, but anyway:

1) Pick and test random points until you have three successful points.
By far the weakest part of my algorithm...if I didn't find three points after N measurements, I did a variety of things, none of which worked well
2) Repeat the following until N measurements overall have been done:
a) Find the convex hull of all good points (using Graham scan)
b) Take each vertex and find the area gain I would get if I extend that vertex out from the centroid. Take each midpoint of each side of the hull, extend it perpendicularly, and see the area gain. Pick the point with the most possible area gain.
c) call isInside with this point of maximal potential gain, update arrays of good and bad points
3) (This led from 2444->2469) Now take the midpoints of each segment of the convex hull, and extend it perpendicularly one last time. Find the area of this new polygon. Return the average of the area of the new polygon and the convex hull (almost always the true area was somewhere in this range).
4) (This stunningly led from 2469->2479) Take one or two thousand results... plug the real (not absolute) differences into Excel (not 2007!) and graph them as a function of N. Fit a curve to the data (logarithmic worked well for me) and apply the result (which is a function of N) as a fudge factor to your answer....
5) (2479->2481) Return the weighted average between the smaller area and 6 times the larger area.
Fudge, indeed, is yummy. I hope this is the start of a bunch of good marathons for me!
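For step 3, both areas come from the standard shoelace formula; a minimal sketch (my own, assuming the hull vertices are already in order):

#include <cmath>
#include <vector>

struct Pt { double x, y; };

// Shoelace formula: area of a simple polygon whose vertices are given in order.
double polygonArea(const std::vector<Pt>& poly) {
    double twice = 0;
    for (size_t i = 0; i < poly.size(); ++i) {
        const Pt& a = poly[i];
        const Pt& b = poly[(i + 1) % poly.size()];
        twice += a.x * b.y - a.y * b.x;
    }
    return std::fabs(twice) / 2;
}

// Step 3: the true area almost always lies between the hull of the good points
// and the hull with each edge midpoint pushed out, so return their average.
double estimate(const std::vector<Pt>& hull, const std::vector<Pt>& extended) {
    return (polygonArea(hull) + polygonArea(extended)) / 2;
}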

EDIT: wow, my approach is somewhat similar to jdmetz's....except that I always include the midpoint in my deliberations as to which point to attempt an extension from, which is probably wasteful in some cases
Example results:
0) 4.990047963928504
1) 4.9913226200532925
2) 4.970696142925412
3) 4.9928553256184856
4) 4.982272505400221
5) 4.907782431935212
6) 4.963737267020377
7) 4.993254406561572
8) 4.998868686733182
9) 4.981794885452505
Re: Post your approach (response to post by jdmetz) | Reply
The idea behind my approach was to find a point in the polygon and then discover it by performing binary search from that point in different directions.
This method gives us a nice simple formula to calculate the area.

N.b. each binary search is performed until the difference between the upper and lower bounds is 0.1

Step 1: Find the polygon:
Try (5,5), then random points until we find the polygon. If we don't find the polygon, return a guess of 1.

Step 2:
Perform binary searches toward the left and right (from our discovered point) to find two points on the edge of the polygon.
Choose the point halfway between these points as the start of all our binary searches.
Perform binary searches towards the top and bottom from our new chosen point.

Step 3:
Now we have found 4 points on the edge of the polygon we can continue to discover the rest of the shape.
We want to perform more binary searches that bisect two adjacent discovered points.
For each possible new search we look at the last two points and the next two points and calculate the uncertainty in the total area from this gap.
This allows us to perform searches in the places that will be the most useful, i.e. to discover corners rather than straight lines.
Each new BS is initialised with the max and min possible values based on information from the existing BSs.
This is performed until we run out of queries.

Things I never got around to doing
Varying the precision of the BS based on the number of queries we have been given.
Using a more structured search to find those small polygons.
Using a better method to find a point within a polygon or using more than one internal point to start BSs on.
Re: Post your approach (response to post by jdmetz) | Reply
My approach is similar.

1. Start with a grid size of 64x64 and find an inside point
2. Using binary search, go from that point to the left, right, up and down and find the intersections with the polygon edges
3. Expand the polygon edges one by one, in order of their size, until there are no more edges greater than a given threshold. We expand an edge by applying binary search from its middle point in the direction perpendicular to it
4. Repeat
pick the best unknown point and query it
until there are either no more unknown points or no more queries left

The "best" point is the one that will minimize the total number of unknown points if we would know its value. This is computed as the minimum between
a) the number of currently unknown points that are situated inside the outer "cone" created by the point we want to query and the convex hull of the inside points found so far, if the query will be negative (point is outside)
b) the number of unknown points that will be added to the convex hull of all the inside points found so far, if the query will be positive (point is inside)

If multiple points are tied for "best", pick a random one among them.

5. If there are queries left, then expand the grid (to 128x128, 256x256, or 512x512 depending on how many queries are left) and repeat from step 3

6. Estimate the area, using the following formula:

area ~= (n_inside_points + n_unknown_points_adjacent_to_inside_points/2)*grid_cell_size.
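A sketch of that estimate on a small cell-state grid (my own illustration; the 0/1/2 encoding for outside/inside/unknown is an assumption, and cellArea is the area of one grid cell):

#include <vector>

// cell values: 0 = known outside, 1 = known inside, 2 = unknown
double estimateArea(const std::vector<std::vector<int>>& grid, double cellArea) {
    int n = grid.size(), m = grid[0].size();
    int inside = 0, boundaryUnknown = 0;
    const int dx[4] = {1, -1, 0, 0}, dy[4] = {0, 0, 1, -1};
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < m; ++j) {
            if (grid[i][j] == 1) { ++inside; continue; }
            if (grid[i][j] != 2) continue;
            // an unknown cell adjacent to an inside cell counts at half weight
            for (int d = 0; d < 4; ++d) {
                int ni = i + dx[d], nj = j + dy[d];
                if (ni >= 0 && ni < n && nj >= 0 && nj < m && grid[ni][nj] == 1) {
                    ++boundaryUnknown;
                    break;
                }
            }
        }
    return (inside + boundaryUnknown / 2.0) * cellArea;
}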
Re: Post your approach (response to post by jdmetz) | Reply
1) Find one, two or four points inside the polygon (depending on available M). Take their average; this is the center of the polygon.
2) Then I do 4 binary searches to find the points to where the polygon intersects with a cross that got the "center of the polygon" as center.

3) I keep the 4 points in an array and then start a loop:
- while there are enough M to do tests:
- pick the two consecutive points in the array that are the largest distance apart.
- Try to find the boundary intersection by searching from the center along the angle halfway between those points (see the sketch after this list).
- Add new point to the array.

4) When it ends I take the convex hull and calculate area.
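A sketch of the selection step inside that loop (my own illustration, not the poster's code): keep the boundary points ordered by angle around the centre, find the adjacent pair that is farthest apart, and return the bisecting direction to binary search along next.

#include <algorithm>
#include <cmath>
#include <vector>

struct Pt { double x, y; };

// Returns the unit direction bisecting the widest gap (by distance) between
// neighbouring boundary points, as seen from `center`.
Pt nextSearchDirection(std::vector<Pt> pts, Pt center) {
    const double PI = std::acos(-1.0);
    std::sort(pts.begin(), pts.end(), [&](const Pt& a, const Pt& b) {
        return std::atan2(a.y - center.y, a.x - center.x)
             < std::atan2(b.y - center.y, b.x - center.x);
    });
    double bestDist = -1, bestAngle = 0;
    for (size_t i = 0; i < pts.size(); ++i) {
        const Pt& a = pts[i];
        const Pt& b = pts[(i + 1) % pts.size()];
        double d = std::hypot(a.x - b.x, a.y - b.y);  // "largest distance" criterion
        if (d > bestDist) {
            double a1 = std::atan2(a.y - center.y, a.x - center.x);
            double a2 = std::atan2(b.y - center.y, b.x - center.x);
            if (i + 1 == pts.size()) a2 += 2 * PI;    // wrap-around gap
            bestDist = d;
            bestAngle = (a1 + a2) / 2;
        }
    }
    return { std::cos(bestAngle), std::sin(bestAngle) };
}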

2359.25 points in preliminary tests.
Re: Post your approach (response to post by jdmetz) | Reply
My approach is:

1) Construct a triangle by checking random points.
2) Use binary search from the centre of the convex polygon constructed so far to find a boundary point. The centre is defined to be the average of all the points of the convex polygon which I have found.
3) Repeat 2 while there are queries left.

This approach got 2251.59 on the 500 full submission test cases.
Re: Post your approach (response to post by jdmetz) | Reply
I maintain two sets of points - inside, and outside.
Each outside point assumes a sector of outside area to preserve convexity of the polygon.
Both sets are non-redundant, so for the inside points I keep only the vertices of their convex hull, and for the outside points I remove all points covered by the outside sector of another point.
This gives me minimal and maximal areas of the polygon.
Then I randomly choose a new query in unknown area (not inside, or outside) such that it will maximize possible reduction of the unknown area for two outcomes - if the new point is inside or not.
At the end I return some value between minimal and maximal areas depending on total number of queries (at .5-.8).
There is a problem of rare cases when all queries missed the polygon.
In these cases I return 71/M, as it gives the best approximation.
My average is about 4.982.
Re: Post your approach (response to post by venco) | Reply
My approach was identical to venco's except in determining where to sample between the inside and outside polygons: I rolled a big ball around the perimeter of the inside polygon, and if the ball did not intersect the outside polygon I used the centre of the ball as the next sample point. If I could not find any such sample points I would reduce the size of the ball a small amount and repeat, thus always sampling the biggest open area between the two polygons.

This unfortunately only got me 36th place provisionally. The main reason it performed so badly I think is that it didn't sample sharp corners very well, unless there were plenty of sampling opportunities available.

Oh well, another day...
Re: Post your approach (response to post by venco) | Reply
Mine's pretty much exactly the same as venco's.

Interestingly, I found if I increased the number of sample points from 1000 per guess to 2000 per guess, when trying to maximize the possible reduction of the unknown area, this caused my solution to perform *worse*.

It turns out that maximizing min(area-reduction-if-true, area-reduction-if-false) is suboptimal, as these points have a significantly greater than 50% chance of being "true".

I didn't understand this until too late to incorporate any changes into my code. So I just grabbed one of my earlier submissions and resubmitted it.

Also, the curve fitting at the end may have helped a great deal (or hurt, depending on how well you did it).

Oh, I had a reasonable approach to finding the first two points; the first point I tried to maximize the distance to the closest prior guess and the edges, and this worked pretty well. For the second point, I tried to maximize the distance to the closest prior guess, while ensuring that the closest prior guess was indeed the point we had already found inside the polygon. After I had two points inside the polygon, I could proceed with my normal inside/outside work.
Re: Post your approach (response to post by radeye) | Reply
I have found that maximizing min(gain_if_in, 2*gain_if_out) is better, but not much.
As for number of samples, I didn't see significant difference, but 160 in my last submission is a bit better than 40 in earlier.
I think reducing randomness is the way to enhance our approach. My current code produces a large number of hull and outside points at the end, way more than the number of real polygon vertices.
Re: Post your approach (response to post by venco) | Reply
Probably this coefficient of gain_if_out should be equal to (1-in_f)/in_f, where in_f is the average fraction of the area inside the polygon. I don't know exactly how much that is, but 0.33 (giving 2) is probably a good guess...
Re: Post your approach (response to post by radeye) | Reply
Mine was very similar to many already posted:
* find some points in the polygon
* expand the convex hull of those points by querying points that give new information (i.e. aren't inside the known convex hull and don't cause the new convex hull to contain known outside points)

I choose new points in 3 different ways in my final solution (all references are to the convex hull of the inside points):
1) Pick the midpoint of an edge and expand along the normal of that edge
2) Take each vertex and expand it away from the centroid
3) Take each vertex and expand it along an edge it sits on (i.e. pick an edge and stretch it in one or the other direction)

The expansion distance is calculated by doing a binary search to find the maximum distance the point can be expanded without including a known outside point, then expanding half of this distance.

To calculate the area at the very end I return a value interpolated between the area of the inner convex hull and that convex hull "blown up like a balloon" (without querying) until it's touching the outside points.

I also tried generating points based on possible corners expanded away from the nearest point on the convex hull, where a possible corner is any extrapolated intersection of two edges that is considered interesting (gives new information).

Edit: I forgot to mention that I pick the point that maximizes the area of the inside hull should the point be inside. There's probably a much better way to do this.
Re: Post your approach (response to post by venco) | Reply
My approach is an adaptive triangulation, inspired by adaptive finite element algorithms. Considering that I did not take into account the convexity, I think it works quite well.

The algorithm starts with a hand-picked initial triangulation which is a very simple one. Then in each step:

* Each triangle is assigned an error, depending on the area and on how many of its vertices are inside the polygon.
* Then these errors are distributed among its edges, depending on the edge length.
* Edges with the biggest errors are marked for refinement, so that the marked edges constitute some fixed percentage (I chose 20%) of the total error.
* Marked edges are refined: the edge is divided in half and so are its neighboring triangles.
* For the newly created vertices, inquire whether they are inside the polygon or not. If a new vertex is on the boundary of (0,10)x(0,10) then it is automatically outside the polygon (to save the question for other 'important' vertices).

After the loop, the area is calculated by appropriately dividing the 'boundary' triangles.
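A sketch of how a triangle's error and its contribution to the final area might look (my own illustration; the exact weighting of the boundary triangles is an assumption):

#include <cmath>
#include <vector>

struct Pt { double x, y; };

struct Tri {
    Pt v[3];
    bool inside[3];   // isInside result for each vertex
};

double triArea(const Tri& t) {
    return std::fabs((t.v[1].x - t.v[0].x) * (t.v[2].y - t.v[0].y)
                   - (t.v[2].x - t.v[0].x) * (t.v[1].y - t.v[0].y)) / 2;
}

// Error indicator: a triangle whose vertices disagree straddles the boundary,
// so its whole area is uncertain; a uniform triangle contributes no error.
double triError(const Tri& t) {
    int in = t.inside[0] + t.inside[1] + t.inside[2];
    return (in == 0 || in == 3) ? 0.0 : triArea(t);
}

// Final estimate: full credit for interior triangles, partial credit for the
// boundary ones (here proportional to the number of inside vertices).
double areaEstimate(const std::vector<Tri>& mesh) {
    double area = 0;
    for (const Tri& t : mesh) {
        int in = t.inside[0] + t.inside[1] + t.inside[2];
        if (in == 3) area += triArea(t);
        else if (in > 0) area += triArea(t) * in / 3.0;
    }
    return area;
}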

I tried to implement an approach like venco used, but my coding efficiency did not allow me to finish in time. In this, the errors on edges would depend on whether a part of the edge can 'see' the convex hull of the interior points through the net of exterior points.
Re: Post your approach (response to post by venco) | Reply
I did the same as venco but I tried to maximize distance to inside and outside regions
Re: Post your approach (response to post by jdmetz) | Reply
My approach was to build an integer grid of size MxM. Then I would find a point within the polygon, and then another. Using the fact that any point between two points known to be within the polygon is also within the polygon, I would not query those again. Also, I would do binary searches to find the edges in relation to each point: find the edge from point A(x,y) towards A'(x,0), A''(x,M), etc. for each side of the grid and the four corners. Then I would take some other points along the edges.

After I had queried M times I would use the three-coins algorithm (Graham scan) to find the polygon from the points I had and then just take the determinant of the points (shoelace formula) to find the area.
Re: Post your approach (response to post by jdmetz) | Reply
I didn't submit anything this time because of lack of time (hey, I'm finally upgrading to Linux after 15 years of being trapped in Windows!). Anyway, I want to share my unfinished approach as it has an idea I haven't seen posted by anyone yet.

In the first phase, I generate a lot (50K) of convex polygons according to the test generator (but I skip all those few polygons that would take a long time to generate). Then I search for a query point that would return false/true for ~50% of the sample polygons. After the query, all the sample polygons that are not consistent with the feedback are removed; if the number of sample polygons is lower than 1K (which happens very quickly as we're removing one half on each query) then new polygons are generated by randomly "mutating" the existing samples such that they are still concave and consistent with the feedbacks.

The second phase starts once the "diversity" of the samples drops below a certain threshold value (as measured by the std. dev. of the areas of the samples). This phase is very similar to what has been posted already: find the query point that maximizes {area discarded if false, area accepted if true}. As radeye has already pointed out, I also found that this binary search was sub-optimal, in the sense that the entropy gain was lower than 1 bit per query.

The third phase is the one that I didn't have the time to implement; it started once all queries were exhausted and would have consisted of randomly generating (as the time limit allowed) valid concave polygons between the minimum convex hull and the outside points, and returning the average area as the result.

I found that the first phase was near-optimal (before diversity dropped), and much better than random sampling of points. For example, it found on average (over 1000 cases) the first "inside point" using ~2 queries, which is indeed the expected value for 1-bit entropy queries (on the other hand, random sampling used more than twice as many queries to find each of the first 3 inside points).

@jdmetz: nice idea to use a genetic algorithm to tune your estimation parameters (that's commented on the source). Do you have a generic framework for doing this, or is it an ad-hoc solution? (edit - why I've the feeling I've asked this before???)

edit - convex => concave... d'oh!
Re: Post your approach (response to post by gsais) | Reply
That's an interesting approach to finding the polygon. That's an area I probably should have concentrated on more, as I just use a pattern I thought up the first day. I fail to find the polygon at all in 22/30000 cases.

I wrote a one-off tool to optimize the parameter values. I first wrote a general multiple linear regression routine, but it was minimizing delta^2 while I wanted to minimize delta.
Re: Post your approach (response to post by jdmetz) | Reply
So I tested my approach with a change to the initial code, so that instead of picking random points until two are inside the polygon, it picks points by choosing 100 random points then taking the one that maximizes the distance to the closest already chosen point (with points put at the corners to begin with). This led to an average gain of .007 points per test case, or 3.5 points on the preliminary tests. Too bad I didn't implement this earlier :/. (Note to self: write down *all* the ideas you have when you have them).
Re: Post your approach (response to post by gsais) | Reply
Like gsais, I had little time (mostly after hours, so I'm exhausted) and didn't finish my approach, mainly because of the thousands of bugs I introduced into my code. In the end I submitted a solution similar to venco's (keep inside and outside points and choose a point that reduces the unknown area) but my choice of the point was not adequate. After exhausting the queries I produced 500 convex polygons that fitted in the unknown area and returned the mean area. I expected this to behave much better than it finally did, so, when there was a large number of trials, I simply returned the area of one of them, but constructed in a more restrictive way.

The polygons were constructed in the following way: 1) pick one main outside point, 2) look for the nearest point in the perimeter of the inside polygon and pick a point in the segment joining them 3) compute two maximal lines through it (ie if the point is in the real polygon the line segment through it will be within the two), 4) look forward and backward for other main outside points reducing the angles between the lines. 5) pick randomly one of the resulting lines and save it. Look for a main outside point not covered in the previous steps and repeat the process with it. Finally construct the proposed polygon using the picked lines and (if needed to complete it) the sides of the square.

I never finished my main idea, which was to construct a convex polygon as before in every trial, pick the edge of the constructed polygon with the most variability, choose the point which most reduces the variability of that edge, and then start again. I wonder how it would have worked, but it was too complicated.
Re: Post your approach (response to post by jdmetz) | Reply
This is the first time that I post my approach. I think it's a nice approach even if the result was bad.

1- find a point in the polygon which is not on the edge:
I find that point by starting from the center of the square and moving like a snake (in a spiral) following the integer points until the whole square is covered. (I think the probability that a polygon does not contain such a point is very near to zero :) ).
When I find this first point I test the four points around it in the directions N, E, S and W at a distance of 0.5.
Then I take the average of these points (not the central point).

2- find the polygon edge points in "all" directions by BS:
I start with pi/2 (four points), then pi/4 (8 points) and so on up to 8192 points (which is impossible :) and I skip points that are already checked.
I think that my error is that I give each new BS a fixed quota of isInside calls.
After each full rotation I build my own polygon and eliminate some vertices from it by convexity (so I gain more area).

I also improved my algorithm by improving the binary search: if a point is not in the polygon for sure (by convexity), then try a point closer to my central point.

Once again, thanks to the author of the problem; it was really a nice one.
Re: Post your approach (response to post by jdmetz) | Reply
My basic algorithm is the same as jdmetz's.
I was actually thinking this is what most people in the top 20 were doing, so I am surprised to see so many convex-hull based algorithms.

Of course my basic implementation of this only scored ~2475, so I'll briefly describe some optimizations.

1) When binary searching 'beyond' a side I would aim towards a probable vertex (the intersection of neighboring sides when viewed as lines).
2) After enough checking it becomes possible to estimate the maximum-possible convex polygon, by summing the maximum possible 'cap' outside of each side that hasn't been examined. So I modified my algorithm to return (maximum_possible + current_estimate) / 2
3) In my last submission I made a modification that would decrease the number of steps allowed in binary searches as M got smaller IF the error term described in 2) was large (>0.01).

Overall my average difference is about 0.022 over 8000 test-cases.
That is an average score of 4.978. Personally I think there is a case in the 500 systests that my solution is very lucky on, which gives me an extra ~0.5-1 points. But we will see...
Re: Post your approach (response to post by TheRaven) | Reply
That is an average score of 4.978. Personally I think there is a case in the 500 systests that my solution is very lucky on, which gives me an extra ~0.5-1 points. But we will see...

What do you mean? If your average is 4.978, then the most you can benefit from a lucky case is 0.022.
Re: Post your approach (response to post by jdmetz) | Reply
Oh, I meant 0.5-1 pts in my provisional score, not in my average. Sorry for the confusion.
Re: Post your approach (response to post by jdmetz) | Reply
First: try "random" points until two "in" points have been found. (not completely random: the first point is always (5.0,5.0) and the next few are also hand-crafted; also, once one point has been found, look closer to that point instead of anywhere in the square: take ((x,y)+(5,5))/2 instead of (x,y)).

Next: try to minimize the "uncharted area".
From known "ins" and "outs" we can construct two polygons: "known inside" and "known outside". Between them is a ring of uncharted land.
Select about 100 random points from that small ring (this is a bit tricky and relies on triangulating the uncharted land). For each of these investigate the change of unknown area if the test should return in or out. Select the random point that maximizes min("in" growth if "inside", "out" shrinking if "outside"). I assume that replacing min(x,y) with xy/(x+y) might have been slightly better, but that was a post-mortem idea.

Finally, return a value between the known inside area and the known outside area -- not the arithmetic mean but rather an extrapolation under the assumption that the uncharted area has the same in:out proportion as the full square (i.e. return all*in/(in+(all-out)) )
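A quick illustration of that extrapolation (my numbers, just to make the formula concrete): with all = 100, in = area of the "known inside" polygon and out = area of the "known outside" polygon, the charted region splits as in : (all - out), and the formula applies that same ratio to the whole square.

// in  = area of the "known inside" polygon
// out = area of the "known outside" polygon, so (all - out) is the area
//       proven to lie outside the real polygon
double estimate(double in, double out, double all = 100.0) {
    return all * in / (in + (all - out));
}
// Example: in = 20, out = 60  =>  proven outside = 40,
//          estimate = 100 * 20 / (20 + 40) = 33.3, which lies between 20 and 60.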

Some remarks:
This contest, I had to fight a lot with awful bugs (sign errors, off by one as in ">0" vs. ">=0", getting the orientation right, mixing up entries when producing the inverse of a 2x2 det 1 matrix, etc.).
This was partly caused by my ambition to avoid rounding errors and
testing only grid points of a 65536 x 65536 grid (actually I intended to go up to (1<<20) x (1<<20)), which allowed me to make a lot of 32bit vs. 64bit errors as well. In fact, my last submission (deadline minus 4 hours) was the *first* to get everything straight under the new paradigm I tried to implement since Saturday. If I had had the framework running earlier, I would have had time to implement all the stuff I had almost ready, e.g.:
  • an "opening book" for up to 10 moves (depending on situation) based on analysis of 100000 test cases I had prepared.
  • trying to fit 5,6,7,...,20 lines into the uncharted area and thus determine a good approximation of the "true" polygon
This was so frustrating - normally I rather suffer from running out of ideas mid-contest...

EDIT: After reading the full thread it looks like I was essentially on the venco trail and had the gsais idea of an opening library.
Re: Post your approach (response to post by hagman) | Reply
In fact, my last submission (deadline minus 4 hours) was the *first* to get everything straight under the new paradigm I tried to implement since saturday

At least you have achieved that; my only submission is still full of bugs. It hangs from time to time (going round and round the square without finding a single point of intersection with another line, for example), or completely cleans out the inside polygon, leaving only a segment as the convex hull ...
Re: Post your approach (response to post by jdmetz) | Reply
Seems like nothing new:

1) As gsais said - generate polygons and choose a point which is included only in 50% of polygons

2) As venco, choose a point which maximizes the newly discovered region (min of in- and out-regions).

3) Return average of two regions

I wonder why no one ever talks about how much time it took them to create their solutions and how their solution evolved over time - at least I'm always interested in that (or maybe I should start a new topic?).

Anyway, I'll start with that :

Almost right away, I thought about computing the "outer-region", so my first solution used random points in the "unknown-region" and updated the convex hull and the outer one. Since I didn't want to spend too much time, I went for Java with java.awt.geom. This took 3-4 hours for a 2430 score.

Then I replaced random sampling with maximizing (d1 * d2) where d1 and d2 are distances to inner and outer-regions (2448) and then replacing it with (2) for 2475 - about 6-7 hours.

The last thing was to include (1). While it decreased the score in general (maybe I have some problems with generating polygons), it got rid of most cases where I had a bad score (<4.5), for a total of about 9 hours of work.

In the end the worst thing was the efficiency (especially java.awt.geom). For (1) I generated only 20 polygons at a step, and usually 6-7 steps at max. And for (2) with random sampling I used at most 60 samples depending on the time left - quite often I didn't use all of the isInside calls, because I ran out of time.

I really thought that there would be many more Java users.

And for gsais: how did you "mutate" your polygons to match new constraints?
Re: Post your approach (response to post by Psyho) | Reply
I started with the most obvious solution to me: a grid of 10x10 points (~1650 points) and continued with the small optimization of extending the grid to sqrt(M) x sqrt(M) (~2150 points).

Because I started with a grid, I tried to fit all the solutions on a grid. I thought about going 'analogue' but I didn't have the time.

The next improvement was to explore the space going down a quad tree of increasing depth. I also tried to extrapolate unknown cells from the existing ones, implementing simple grid rules like "all the points between 2 inside points are also inside". This simple approach quickly discovered the boundary cells and divided them recursively. ~2430 points.

After some optimizations (like trying to extend horizontal and vertical edges by doing a binary search on a parallel line, a panic routine for finding very small polygons, or a special BFS search, also made for small polygons, that searched first near the existing inside points) that got me to ~2445, I felt that this was a dead end and moved on.

Moved on to the grid convex-hull approach, pretty much described in my previous post.

I started from a known inside point, found 4 edge points by binary search to the left, right, up and down, and then for each edge of the convex hull of the inside points (the convex hull is updated automatically on each query) I tried to expand its midpoint along the normal, using of course binary search.

This scored less than my previous quad-tree approach, until I implemented the computation of outside regions determined by an outside point and the inside convex hull. ~2460p.

But the midpoint expansion left a lot of queries unused, so I spent them by selecting unknown points, first near the convex hull vertices and next at random near the hull edges. ~2475p

The last improvement was to select at each step the point that minimized the number of unknown points left, just like venco, Psyho or hagman. I used the same function to maximize: min("in" growth if "inside", "out" shrinking if "outside"). That got me to ~2484 points, but it also made my solution very slow, because at first there are a lot of unknown points (especially on large grids) and I compute the function described above for every one at each step. That made testing extremely slow and left little time for improvements.
Re: Post your approach (response to post by Psyho) | Reply
how did you "mutate" your polygons to match new constraints?

I mutate the polygons by moving their vertices (but the "source" polygon already satisfies the constraints). Basic pseudo-code:

while (population.size < 1000)
{
  src = pick a random polygon in population // src already satisfies the constraints
  tgt = src.clone()
  for each vertex v in tgt do // in random order
  {
    repeat up to k times
    {
      move vertex v to a random near location
      if tgt still satisfies the constraints then break loop
      undo move
    }
  }
  if tgt != src then add tgt to population
}
The problem with this technique is that each mutation reduces the real randomness of the total population, which is vital for estimating the probability of a point being inside or not.
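A sketch of what "still satisfies the constraints" could check (my own illustration, not gsais's code): the mutated polygon must stay convex and must still classify every past isInside result correctly.

#include <vector>

struct Pt { double x, y; };
struct Query { Pt p; bool wasInside; };   // past isInside feedback

double cross(const Pt& o, const Pt& a, const Pt& b) {
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// Convexity check for vertices given in counter-clockwise order.
bool isConvex(const std::vector<Pt>& poly) {
    int n = poly.size();
    for (int i = 0; i < n; ++i)
        if (cross(poly[i], poly[(i + 1) % n], poly[(i + 2) % n]) <= 0)
            return false;
    return true;
}

// Point-in-convex-polygon (CCW order): inside iff not strictly right of any edge.
bool contains(const std::vector<Pt>& poly, const Pt& p) {
    int n = poly.size();
    for (int i = 0; i < n; ++i)
        if (cross(poly[i], poly[(i + 1) % n], p) < 0) return false;
    return true;
}

// A mutated candidate is acceptable only if it is convex and agrees with
// every feedback received so far.
bool satisfiesConstraints(const std::vector<Pt>& poly,
                          const std::vector<Query>& history) {
    if (!isConvex(poly)) return false;
    for (const Query& q : history)
        if (contains(poly, q.p) != q.wasInside) return false;
    return true;
}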
Re: Post your approach (response to post by gsais) | Reply
Perhaps this mutation operator combined with a genetic-algorithm-like crossover function to "breed" two polygon representations might have helped maintain population diversity.
Re: Post your approach (response to post by Psyho) | Reply
My first approach was actually to find a side of the polygon, and then progress around the perimeter (with binary searches along each side) estimating a new angle for each side when a vertex was reached.

Unfortunately this solution suffered from many problems, so it was abandoned and never submitted...
Re: Post your approach (response to post by TheRaven) | Reply
I was going to try this approach next, but I ran out of time. My biggest fear was that, while my current solution degrades gracefully when I run out of queries, this solution would not - if I only got halfway around, I'd have a terrible estimate.
Re: Post your approach (response to post by jdmetz) | Reply
That is exactly the problem I had with it. To compensate I had to reduce accuracy of angle computations with small M which, of course ended up giving very poor estimates for such cases.
Re: Post your approach (response to post by TheRaven) | Reply
That was my first thought at a solution too. I decided the half-way-around problem was a fatal flaw before I started implementing it, since it was apparent that the near-100-sample test cases were going to decide the match. Clearly everyone was going to do well on the near-300-sample cases, and the errors on those would be very near zero.
Re: Post your approach (response to post by TheRaven) | Reply
My approach was to find a point inside the polygon: first you have the 10x10 square, take the point in the middle, check it, and divide into 4 new squares... and so on until you find a point inside.

From this point, with some binary searches, find 3 points on the perimeter of the polygon. This is a triangle. Take each side, take its middle, and try to make another triangle with a binary search along the height going outward from the triangle you are searching on, and so on.

I think I have a bug... because this only got 2222.07. I did some tests, put in some if's, and made my score 2239.
Re: Post your approach (response to post by vlad_D) | Reply
I don't think you have a bug. I think the problem is that you don't seem to keep track of points that fall outside of the polygon, so many of your queries during the binary search could be answered without actually using up one of your measurements. If adding a point would cause your polygon to include a point you already know is outside, then you know that query will return false and can save a measurement. It's also possible that turning an edge into a triangle could affect more than just that edge (if you push it out far enough, it could cause a concavity to occur, which means you have to remove some points to keep the polygon convex). If you didn't handle this case it could lead to an underestimation of the polygon size, and to some other interesting artifacts stemming from the generation of concave polygons when you might have assumed it would stay convex.
Re: Post your approach (response to post by paranoia) | Reply
For paranoia:
there will be no concavity, because I start out with a triangle with its vertices on the polygon boundary. When I expand a side with binary search, the new point will be "on" the polygon boundary, and so on... So there will be no concavity :)
Re: Post your approach (response to post by vlad_D) | Reply
It's possible that lack of precision on earlier binary searches could lead to some minor concavities, I'd agree that the effect is probably pretty minimal. I'd guess that your biggest losses come from making unnecessary measurements as previously mentioned, and by not establishing an upper bound on area and using that to refine your final return more accurately.
Re: Post your approach (response to post by paranoia) | Reply
o yea.. thanks! :)
Re: Post your approach (response to post by vlad_D) | Reply
I ended up doing something quite similar to vlad_D.

- Find a point inside (similar fractal-like search pattern)
- Binary search in a few directions to approximately (something like +/- 0.04) find some edges and make an initial polygon (convex hull).
- Estimate the possible reduction in error available by splitting each existing edge, then split the juiciest one using binary search. Also estimate the expected reduction in error of refining the location of a "known" point on the convex hull and refine that instead of splitting if it was juicier.
- Keep splitting and refining until I run out of samples
- Each point on the convex hull is actually an interval with a sample point known to be outside and a sample point known to be inside. Using this I estimate the area using a weighted average of the inside and outside limits of each point (3/4 outside + 1/4 inside seems to work well).
- At this point I am still pretty consistently underestimating the area (by about 0.1) so, I then add the median error observed by running lots of tests. I fit a function to the number of samples for this. I probably should have fit it to the area and number of samples (subtracting any samples wasted finding the polygon at the start).

The main data structure is just an ordered vector of "points" on my estimated convex hull, where a "point" is actually two points as mentioned above.

Most of the thinking and coding went into estimating the error reduction expected by refining an edge into two, and estimating the most extreme location of the expected but unknown actual polygon vertex, using the neighboring "points" on the convex hull. The final code ended up being pretty simple. I tried some edge tracking in the last couple of days, but I couldn't get it to improve my score.

For reasons of simplicity early on I started using four initial points for the convex hull, planning on trying three later, but never got back to actually try that. I tried eight at one point when I had a tricky test case with the first point way in a very acute corner of a small polygon. It fixed that case but made everything else worse, so it was quickly abandoned. Not trying three initial points probably cost me.

It was clear from the beginning that with the scoring specified in the rules the performance with few sample points would totally dominate the final score. So I never put any effort into anything that took lots of samples.

Provisional score: 2484.76 (getting the next few points looked really hard, although trying starting with a triangle and calculating the median error differently would have probably helped a bit, maybe got me into the top 10, but I was burning out by then)
Re: Post your approach (response to post by Rustyoldman) | Reply
WooT! Looks like I just squeaked into the top ten anyway!

Must have been a good 4000 example cases for me, or just bad for the five people I passed in the system tests.

[TheFaxman: looks like the plan of running systests until I win is working out. Keep up the good work!]
Re: Post your approach (response to post by jdmetz) | Reply
Somewhat similar to venco.

My initial outer polygon consists of the four corners. To start, I find a point in the polygon, starting at a random (+/-0.2) point near the middle, then near the middle of the half squares...

Next I use binary search between that point and the four nearest outer points to form an initial inner polygon. Only the convex hull of the inner points was remembered after that.

Outer points hidden behind others were removed. I added extra points between the remaining outside points, marking the furthest positions the polygon could reach. This gave a star-shaped outer polygon, except where limited to the edge of the board.

I chose the next test point from the centroids of triangles formed between consecutive pairs of inner points and various points on the outer star. The point that maximised the potential gain in size of the inner polygon was selected.

The return was the squared mean root of the areas of the inner and outer star polygons, except in exceptional cases where emergency values were used.

A lot of the maths was new to me. And I had many bugs - e.g. my PointInTriangle() function that returned true if the triangle was a straight line! The problem seemed like it belonged in an algorithm match, e.g. if the goal was to be accurate to 1e-9 in <2 seconds; I think we could have done it. Scary, but fun in the end.

One last thing, I wrote a little SVG library to help visualise the state of my solution. It was invaluable.
Re: Post your approach (response to post by murrayr) | Reply
I wrote an svg library too, but I didn't find a viewer that I was very happy with. In the end I used rsvg-view which comes with librsvg. What svg viewer did you use? Here's one of mine: example.svg showing inside (green) and outside (red) points, the true polygon (black), hull sides through time (green), and line segments I'm binsearching along (gray).
Re: Post your approach (response to post by jdmetz) | Reply
I used firefox. Your svg didn't work in firefox until I changed the xmlns attribute to xmlns="http://www.w3.org/2000/svg".

Here is the final svg produced from my solution on the visualiser startup test case. Inner is green, outer (tested points only) is red, star is yellow, black is closest point between inner and star (not used in final solution), blue dot is a reference point inside the inner polygon.
Re: Post your approach (response to post by murrayr) | Reply
Thanks - I wondered why Firefox didn't display it.
Re: Post your approach (response to post by jdmetz) | Reply
I don't know if your SVG is incorrect or not, but I do know that
in order to get either FireFox or InkScape to load it I had to
change your xmlns from

xmlns="http://www.w3.org/2000/svg/"

to

xmlns="http://www.w3.org/2000/svg"
Re: Post your approach (response to post by jdmetz) | Reply
What's wrong with firefox or konqueror?
EDIT: I see you have already noted that your problem was merely due to an error in the generated svg

Btw. that image really shows well how you pin-point the polygon.
Re: Post your approach (response to post by murrayr) | Reply
One last thing, I wrote a little SVG library to help visualise the state of my solution. It was invaluable.

Yep, I did that, too, and agree completely. Seeing how the boundary of forbidden area could suddenly grow beyond the square it was formerly confined to was instructive and seeing how my triangulation method suddenly walks backwards was invaluable.
Re: Post your approach (response to post by jdmetz) | Reply
Well, I guess I have nothing new or brilliant to share, as most of the (relevant) ideas were already described, but here goes...

1) I start with a big cross and a big "X" around the middle point (5,5)
2) If I have no hits, I try smaller crosses centered on each of the 4 quadrants.
3) If I have no hits, I try shooting the center of each 1x1 square on the first/last column and row of the 10x10 initial square. This is to try and recover from VERY degenerate cases where polygons were too small, too thin or too evil ;-))
4) If I still have no hits, I return 1.99, because it seemed like a fun value, and not far from the area most of the polygons like these I found had, or the area that was sort of uncovered by my initial search.

All point-searching used a binary search, stopping when the distance between two consecutive shots was below a given threshold. I had different thresholds depending on the step I was on.

At this point, I should have something like a bounding box, and I apply the following steps:

1) Expand the initial polygon in the 4 directions (pick the highest point, and start searching to the left and to the right, no farther than the leftmost and rightmost point... do the same in other directions)
2) Compute the convex hull (this is done many times along the way, actually, but from this point on becomes extremely relevant and necessary)
3) Given three edges 1, 2 and 3, try expanding from the middle point of edge 2 towards the intersection point of edges 1 and 3. This helped quite a bit on triangle-like polygons. (I can't quite recall why I started from the midpoint instead of just following an edge, since step 4 will do it quite similarly, but anyway...)
4) For each edge, from its midpoint, search perpendicular to it for points inside the polygon. (I've found this to work better after the above step instead of the other way around. The line intersection from above was a later addition)
5) Go to 3, and repeat until no new points inside the polygon are found, or run out of shots (M <= 0)

I've added obvious checks (is point already inside polygon? is this outside the 10x10 square? have I already shot this place, as a hit or as a miss?) and a heuristic for saving up shots: If I had 3 or more points outside the polygon within a given distance from the point I'd like to test, I automatically consider it a miss as well.

There were cases where I had leftover shots, but I had no idea what to do with them :-(( Thankfully, these were rare enough that I could worry about bigger things.

I thought of computing what I called the "inverse convex hull" on the outside points. You guys called it "balloon/blown up hull/polygon" or simply a boundary/useful area. It was the smallest polygon such that it contained my polygon and no points inside it were misses (if one was, it had to be a vertex), but I had insane ideas for computing it (really inverting the hull algorithm! dear god... *shame*), and eventually gave up without being able to do it... Seems so simple when you explained it :-P

Lastly, I had an "error" function... 0.25 * (300-M)^2 is what worked best for me (since I did not have the "inverted hull"). I added this to the area of the resulting polygon and it seemed good enough, with not so good results only on weird polygons or when using a low M on an easy to find polygon (that is, with an already very accurate guess).


Someone said something about near-zero probability of not finding the polygon... Well, I have found some quite odd cases, and I'd like to see the ones you guys have to share as well... How zero-probability are these? ;-))

seed 13072
N 5
(this one inspired step 3 of the initial search)

seed 38169
N 5
(getting thinner...)

seed 2031
N 12
(...and thinner!)

seed 49095
N 5
(literally, needle in a haystack)


I really liked the idea of generating SVG/PostScript files; I missed that, since the Visualization window gets cluttered fast (unless I use very small, hard-to-see points). I plan on looking into your code to learn more about this :-))


This was my debut in Marathon Matches, and I liked it very much. The problem is very simple, yet challenging, and I couldn't stop thinking about it, and trying every little drop of optimization I could think of (some of which made the code worse, too!). This is addictive, and I should be careful from now on ;-))

I loved to see all of your ideas explained, way better than cracking my head open looking at source codes. Please keep them coming!! :-))


EDIT: typos... :/
Re: Post your approach (response to post by jdmetz) | Reply
I used a simple Monte Carlo method (100.0 * pointsInside / pointsTotal) with a test proxy.

The test proxy was similar to others' approaches. It formed a convex hull of all the points inside the polygon and it formed a bunch of wedges from all points outside the polygon. If the Monte Carlo program asked to test a point inside the local hull, the proxy immediately returned true. If the program asked to test a point inside one of the wedges, it returned false. Only if it had no idea did it use up one of the queries.

Later, I used max(monteCarloResults, localHullArea) to ensure I didn't waste the local information about the hull I had.
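A sketch of the Monte Carlo side with a memoizing proxy (my own illustration; the hull and wedge shortcuts are collapsed into a plain cache here, and realIsInside stands in for the judge's query):

#include <cmath>
#include <functional>
#include <map>
#include <random>

struct Pt { double x, y; };

// Answer from known information when possible, otherwise spend a real query.
// The actual proxy answered "true" inside the hull of known inside points and
// "false" inside a wedge behind a known outside point; here that knowledge is
// reduced to a cache keyed on a rounded version of the sample point.
struct Proxy {
    std::function<bool(Pt)> realIsInside;   // placeholder for the judge call
    std::map<std::pair<long, long>, bool> cache;
    int queriesLeft = 0;

    bool test(Pt p) {
        auto key = std::make_pair(std::lround(p.x * 100), std::lround(p.y * 100));
        auto it = cache.find(key);
        if (it != cache.end()) return it->second;   // answered for free
        if (queriesLeft <= 0) return false;         // conservative fallback
        --queriesLeft;
        return cache[key] = realIsInside(p);
    }
};

// Plain Monte Carlo over the 10x10 square: area ~= 100 * pointsInside / pointsTotal.
double monteCarloArea(Proxy& proxy, int samples, std::mt19937& rng) {
    std::uniform_real_distribution<double> U(0.0, 10.0);
    int hits = 0;
    for (int i = 0; i < samples; ++i)
        if (proxy.test({U(rng), U(rng)})) ++hits;
    return 100.0 * hits / samples;
}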

Even with a method as unoptimized as Monte Carlo, I got decent example results :

4.238, 4.864, 3.962, 4.957, 4.981, 4.788, 4.313, 4.722, 4.961, 3.961.

Unfortunately, I made a bad full submission and didn't have enough time left in the contest to resubmit.