JOIN
 Select a Forum     Round Tables New Member Discussions News Discussions Algorithm Matches Marathon Matches NASA Tournament Lab TopCoder Cookbook High School Matches Sponsor Discussions Development Forums Design Forums Search | Watch Thread  |  My Post History  |  My Watches  |  User Settings View: Flat (newest first)  | Threaded  | Tree Previous Thread  |  Next Thread Forums TopCoder Cookbook Algorithm Competitions - Rewriting Phase 1.3. Choosing Correct Numeric Data Types
 1.3. Choosing Correct Numeric Data Types | Reply This is a rewrite of this recipe and this rewrite of it.I couldn't find a simple problem which would require comparing doubles with certain tolerance (which is a must for this recipe) - any ideas?ProblemAbsolute majority of TopCoder problems involve certain calculations. In some cases an otherwise correct solution fails due to minor bugs in their way of processing numbers. How to learn to recognize such cases and avoid these bugs?SolutionAll numeric errors originate from the simple fact that our mind operates with abstract numbers, and our compilers handle machine implementations of numbers. Two main reasons of numeric errors are overflow and floating-point imprecision.Overflow happens when the result of calculations doesn't fit within the datatype it has. Overflow usually takes place in integer calculations (in doubles this is exceedingly rare), and is cured by using long type instead of common integer type.Floating-point imprecision is caused by specifics of floating point numbers representation, and happens when doubles are compared directly (without epsilon tolerance). This is cured either by avoiding doubles in calculations or by being extra careful when using and especially comparing them.DiscussionHere are some very basic tips on how to identify places where tricky numeric calculations can happen:- examine data types of input and output variables. Evident but still useful - if the method's return is of type long or double, you'll naturally have to use it at some point of calculations.- learn by heart the minimum and maximum values that can be stored in numeric data types in your language. No need to be super precise, it's enough to remember that absolute value of these limits for signed 32-bit integer is approximately 2*109, and for signed 64-bit nteger (usually referred to as long) - approximately 9*1018. Each time the intermediary results of your calculations are close to the limits, take care about overflow.- remember to keep trace of calculations you perform during your solution. Some operations require explicit casting to another data type to behave in a way you mean them to. These are multiplication which can overflow, and division which, when applied to integer numbers, produces an integer result instead of intuitive floating-point one.- note that while C++ has unsigned integer data types, Java has none. Since reference solutions to all problems are written in Java, you are guaranteed to be able to solve each problem without unsigned types. However, in some cases they are handy for storing bitmasks or using them to simplify avoiding overflow.- don't use 8-bit and 16-bit integers and 32-bit floating-point numbers, unless you're sure you need them. They won't increase processing speed, but can lead to unexpected overflow/precision errors. Use them only if memory consumption is really critical.Now, let's have a look at some examples of problems in which you have to take care of data types and the ways in which this can be done.FeudaliasBattle (multiplication overflow + integer division)This lovely problem gives one two chances to fail due to misuse of data type conversions. Let's first have a look at the correct submission (the one suggested in the editorial) in C++:```#include #include   using namespace std;   class FeudaliasBattle { public: double getMinimumTime(vector baseX, vector baseY, vector siloX, vector siloY, int takeOffTime, int rechargeTime, int missileSpeed) { int i,j,k; double silobase[4][2],dx,dy; for (i=0; i<2; i++) //index of silo for (j=0; j<2; j++) //number of recharges till this shot for (k=0; k<2; k++) //index of base { dx = baseX[k]-siloX[i]; dy = baseY[k]-siloY[i]; silobase[i+2*j][k] = takeOffTime*(j+1)/60.0 + rechargeTime*j + sqrt(dx*dx+dy*dy)/missileSpeed; } double ret=1e10; for (i=0; i<4; i++) for (j=0; j<4; j++) if (i!=j) ret 0; n-=k) { try { mf = checked(mf*n); } catch (System.OverflowException e) { return "overflow"; } } if (mf>mx) return "overflow"; return mf.ToString(); } } ```Java coders could use BigInteger class to avoid technical overflow issues completely. However, they still would have to do the check for logical overflow on each step, since operations on BigInteger are slow, and test case { 2000000000, 1 } would require about 2*109 multiplications. ```import java.math.*; public class Multifactorial { public String calcMultiFact(int ni, int ki) { BigInteger mf = BigInteger.ONE, mx = new BigInteger("1000000000000000000"); BigInteger n = new BigInteger(ni+""), k = new BigInteger(ki+""); for (; n.compareTo(BigInteger.ZERO)>0; n = n.subtract(k)) { mf = mf.multiply(n); if (mf.compareTo(mx)>0) return "overflow"; } return mf.toString(); } } ```Finally, a universal approach (demonstrated in Java) required manual check for multiplication overflow. This can be done by checking whether maximum value divided by one of multipliers is less than the other multiplier (if it is, product does overflow). Alternatively one can use doubles to store numbers larger than 263, but this might result in undesired loss of precision.```public class Multifactorial { public String calcMultiFact(int n, int k) { long mf = 1, mx = 1000000000000000000L; for (; n>0; n-=k) { if (mx/nmx) return "overflow"; return mf+""; } } ```
 Re: 1.3. Choosing Correct Numeric Data Types (response to post by Nickolas) | Reply Discussion continuedFindTriangle (floating-point imprecision)The evident approach for this problem is to use Heron's formula for calculating triangle's area. However, this is wrong: due to floating-point imprecision this formula returns non-zero area for some triangles with collinear vertices. A much safer way to calculate triangle's area is via cross-product of vectors representing its sides (see recipe "Using Vector Operations" for details). Finally, since coordinates of vertices are integer, the area will be an integer divided by two. To avoid comparing floating-point areas, we can store and compare twice-areas and divide the maximal of them by 2 only when returning it.Aircraft (floating-point imprecision + integer overflow)This is a nice example of a problem in which both types of error can occur. See recipe "Working with Lines and Segments" for analyses and solution.It is often helpful to do as many of your computations as possible in integers, switching to floating-point numbers only when it's really necessary. Thus, if you compare Euclidean distances between integer points, you can work with squared distances (which are integers) and extract square root only at the very end. Similarly, in PerforatedSheet you should calculate both total mass and weighted sum of rectangle center positions (multiplied by 2 again) in 64-bit integer, and postpone the division to the last step.See AlsoRecipe "Working with Numbers" for details on representation of integer and floating-point numbers.Recipe "Using Rational Arithmetic to Avoid Floating Point Imprecision" for another way to avoid floating-point imprecision.
 Re: 1.3. Choosing Correct Numeric Data Types (response to post by Nickolas) | Reply Some thoughts about numeric types are posted here.
 Re: 1.3. Choosing Correct Numeric Data Types (response to post by syg96) | Reply Borrowed your comment about using proper primitive data types, thanks. We'll probably remove this tip and the one about integer division from that recipe, since they are covered in this one.
 Forums TopCoder Cookbook Algorithm Competitions - Rewriting Phase 1.3. Choosing Correct Numeric Data Types Previous Thread  |  Next Thread