JOIN
Get Time
forums   
Search | Watch Thread  |  My Post History  |  My Watches  |  User Settings
View: Flat (newest first)  | Threaded  | Tree
Previous Thread  |  Next Thread
Green Zone/Red Zone | Reply
Re: Green Zone/Red Zone (response to post by Ryan) | Reply
Ok, now before the next operation of rounding, lets talk a bit more about what the green zone, red zone theory means about the philosophy of using floating point numbers. Lets say do a calculation and mathematically the end result should be exactly 1. That's all fine and dandy, but in the real world we'll end up with a number with a green zone and a red zone. Lets assume that the green zone has 7 digits. So the question is, what actual value can the stored data have? Well, the answer to the question is easy, because we can just put the highest and lowest values for the red zone:
1.0000000999 to 0.9999999000
Namely, the stored number can never be more that 1e-7 away from the value of 1. In fact this value, 1e-7 is quite special. Special enough to give it a name, epsilon (the name comes from mathematicians practice of calling a small positive number by the same greek letter). Also note that the only thing special about the actual number "1e-7" is that in this case the green zone happens to be 7 digits, if your green zone has a different size, then your epsilon should too (TopCoder usually wants you to have an epsilon of 1e-9). Why is the epsilon important? Because we know from the green zone/red zone arguments the validity of this important fact:

true value - epsilon < stored value < true value + epsilon

This is just a restatement of what the possible stored values are. But really if you think about it, that question isn't the one we usually want to ask. In fact we know the stored value; just check the contents of the variable you're using. The question is, what is the true value? If you fiddle around with the previous inequalities and try to solve them for the true value, then you'll end up with this similar fact:

stored value value - epsilon < true value < stored value + epsilon

Now there are two notable things about this equation. The first is that the true value is never epsilon away from the stored value. This should be obvious because the stored value should be less than a small amount away from the true value. But the second thing is more important. If we know the stored value then we know a range of values that the true value lies in. We don't know the true value exactly, but we have the set of possible values it could be. This is the whole reason for using epsilon in floating point comparisions, because if the ranges of two variables overlap, then the two values could be equal.

Ok, now that that's through, back to the point of rounding. Let's say we want to round the variable x to the nearest integer less than or equal to x. For a simple first look, assume that x contains the exact value (before you realized that floating point computations were exact you assumed this exact same thing, so think of this as your code before learning the magic that is floating point). Well, you say, this is a simple task. This is just truncation, which is what casting to an integer does. (int) x is the desired value! Well, this is not quite so. Why? look above. If the true value of x is exactly 1 then x could be as low as 0.9999999000. When this number gets truncated it's going to go straight down to 0, be dead wrong, and get you that exact score in the SRM. The cure is plain and simple, just add epsilon to x before you do the cast. This way if x is anywhere near an actual integer(and so we must assume that the true value of x is that integer), then it will be gauranteed to get rounded to the right place. On the other hand if x is not close to any integer, then adding epsilon won't change the result.
Re: Green Zone/Red Zone (response to post by Ryan) | Reply
Thank you for the great explanation Ryan! So if you round down you should do something like (int)(x+1e-9). If you round up you do (int)(x-1e-9). Is this correct?

How do you know what epsilon will be? Consider my previous example (see other post). In 3^35, how do you know that the answer will be off by 3?

This made me think a bit more. So green zone means the digit is 100% correct, red means there is 50% its wrong. I guess yellow means that there is 25% its wrong. I wonder if there is a hardware architecture that lets you extract this information quickly? Ie. I want a function like accuracy(digit) that tells me what the % accuracy that digit has.
Re: Green Zone/Red Zone (response to post by dimkadimon) | Reply
Well, casting something to an int rounds it down, so if you're doing that, you have to guard against rounding a 1.999999999999 that's really a 2-epsilon to 1 instead. So you'd use (int)(x+1e-9). The time you'd either add or subtract is if you wanted to check equality - if you wanted something like "if (x == y)", it's safer to say "if (x+1e-9 > y && x-1e-9 < y)" or "if (abs(x-y) < 1e-9)". And it's even safer to do relative error like I think misof explained in the article.

The reason 3^35 is off likely has to do with storage space. 3^35 is 50031545098999707, 17 digits in decimal, or 56 digits in binary. The mantissa of a double is 52 binary digits (53 counting the implied 1), which is 3 digits too few. What that means is that even if exponentation is one operation, and the resulting answer is the absolute closest a double can be to the right answer, the last 3 binary digits are going to be cut off, so the answer will be wrong unless the last 3 digits are 0 in the real answer (which they aren't, because powers of 3 aren't going to be multiples of 8). And this is assuming you have one red digit like the starting value of PI that Ryan uses as an example.

I'm not sure green and yellow are importantly different - even a green digit could potentially be wrong if the answer is close to certain 'boundaries', which is why we look at answers we get like 1.99999999998 for problems and understand that's the same as the correct answer of 2.0 (even though all the 'green digits' are wrong, only the last 2 digits are likely real 'red digits'). I'm not sure if there is a computing architecture which remembers how many signficant digits it has, but it sounds possible in some convention or another.
Re: Green Zone/Red Zone (response to post by Kawigi) | Reply
(53 counting the implied 1)


Thanks for putting that bit in. I'd forgotten about the implied 1 and had been trying to figure out how 335 could have been only 3 off. Now I know.
Re: Green Zone/Red Zone (response to post by Kawigi) | Reply
Sorry I got that wrong. Should rounding up be (int)(x+1+1e-9)? Rounding to the nearest integer - (int)(x+0.5+1e-9)?
Re: Green Zone/Red Zone (response to post by dimkadimon) | Reply
rounding to nearest is correct (if you want always round .5 up semantics).
Re: Green Zone/Red Zone (response to post by Ryan) | Reply
Are you sure that's correct? What if the number you're trying to round up is exactly 2, then rounding it up should still give 2 since no rounding is necessary. But (int)(x + 1 + 1e-9) would give 3 because you've added more than 1 to it and then rounded it down.
Re: Green Zone/Red Zone (response to post by aussie) | Reply
Hmmm... maybe it should be (int)(x+1-1e-9)

(int)(1.999... +1-1e-9) gives 2
(int)(2.000...1 +1-1e-9) gives 2. Or does it? I am confused!
Re: Green Zone/Red Zone (response to post by aussie) | Reply
Yeah, you're right. I guess I should add 9:15 PM to the list of times when I shouldn't post. The correct form of the ceiling operation is:
(int)(x+1-epsilon)
Re: Green Zone/Red Zone (response to post by Kawigi) | Reply
> I'm not sure if there is a computing architecture which remembers how many signficant digits it has, but it sounds possible in some convention or another.

Some CPUs (Intel and compatibles) will signal precision loss, e.g when subtracting two almost equal floats (where the result is almost complete garbage). But that info is typically not (or not easily) available to high level programming.
There you may implement arithmetic on intervals if You want to keep track of actual error.
RSS