Rounding Modes

Four rounding modes are discussed with examples.

IEEE-754 defines four main rounding modes for floating-point numbers. The first three hardly need any special explanation, so a couple of examples in the decimal system should be enough. The last one, which is used by default, is discussed in more detail.

Here are the four rounding modes.

• Towards zero. For example, 4.9≈4; −4.9≈−4.
• Towards minus infinity. For example, 4.9≈4; −4.9≈−5.
• Towards plus infinity. For example, 4.9≈5; −4.9≈−4.
• “Half to even” (round to nearest, ties to even).
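
The four modes can be tried out in the decimal system with a short Python sketch. It relies on the standard `decimal` module, whose `ROUND_DOWN`, `ROUND_FLOOR`, `ROUND_CEILING` and `ROUND_HALF_EVEN` constants correspond to the modes listed above; the helper name `round_to_integer` and the extra inputs 4.5 and 5.5 are illustrative assumptions, not part of the original text. It rounds decimal numbers to integers, so it only mimics what IEEE-754 does with binary mantissas.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_FLOOR, ROUND_CEILING, ROUND_HALF_EVEN

# Map the names used in the text to the decimal module's rounding constants.
MODES = {
    "towards zero": ROUND_DOWN,
    "towards minus infinity": ROUND_FLOOR,
    "towards plus infinity": ROUND_CEILING,
    "half to even": ROUND_HALF_EVEN,
}

def round_to_integer(text, mode):
    """Round the decimal string `text` to an integer using the given mode."""
    return int(Decimal(text).quantize(Decimal("1"), rounding=mode))

for name, mode in MODES.items():
    # 4.5 and 5.5 are exact ties, so they only matter for "half to even".
    results = [round_to_integer(x, mode) for x in ("4.9", "-4.9", "4.5", "5.5")]
    print(f"{name:>22}: {results}")
```

For 4.9 and −4.9 the first three rows reproduce the examples in the list; the last row shows the ties 4.5 and 5.5 going to the even neighbors 4 and 6.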

The last rounding mode requires further explanation. In this mode a number is rounded to the nearest number that can be represented exactly. When it lies exactly half-way between the two nearest representable numbers, we choose the one whose mantissa is “even”, i.e. whose least significant bit is zero.

In our example, where $p=3$, the number 4.5 cannot be represented, since it would require 4 bits in the mantissa: $$\require{color} 4.5=1.00\colorbox{gray}{1} \times 2^2.$$

The last bit, marked in gray, does not fit in the mantissa, so the number has to be rounded. The number 4.5 is equally close to its nearest lower neighbor ($4=1.00\times 2^2$) and its nearest upper neighbor ($5=1.01\times 2^2$), so there is an ambiguity. In such cases we choose the candidate whose least significant bit is zero, which here is $4=1.00\times 2^2$. Another example is the number 5.5: $$\require{color} 5.5=1.01\colorbox{gray}{1} \times 2^2.$$

In this example we round up, because the upper neighbor is the one whose last bit is zero: $6=1.10\times 2^2$.
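
The same tie-breaking can be observed on real hardware. The sketch below assumes that Python's `float` is an IEEE-754 double with 53 significant bits (true on common platforms) and that the default rounding mode is in effect. Above $2^{53}$ only even integers are representable, so an odd integer there is an exact tie between its two neighbors.

```python
# Assumption: Python floats are IEEE-754 doubles (53 significant bits).
big = 2**53  # from here up to 2**54, neighbouring doubles are 2 apart

# 2**53 + 1 is exactly half-way between 2**53 and 2**53 + 2.
tie_low = float(big + 1)
# 2**53 + 3 is exactly half-way between 2**53 + 2 and 2**53 + 4.
tie_high = float(big + 3)

# Offsets of the rounded results from 2**53.
print(int(tie_low) - big)   # tie resolved downwards, to 2**53
print(int(tie_high) - big)  # tie resolved upwards, to 2**53 + 4
```

In both cases the result is the neighbor whose mantissa ends in a zero bit: $2^{53}$ and $2^{53}+4$, never $2^{53}+2$.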

Here are a few other examples. Try to analyze and understand them on your own.

• $\require{color} 4.53125=1.00\colorbox{gray}{1}0001\times2^2\approx 5.$
• $\require{color} 5.4375=1.01\colorbox{gray}{0}111\times2^2\approx 5.$
• $\require{color} 3.75=1.11\colorbox{gray}{1}\times2^1\approx 4.$
• $\require{color} 3.625=1.11\colorbox{gray}{0}1\times2^1\approx 3.5.$
• $\require{color} 1.375=1.01\colorbox{gray}{1}\times2^0\approx 1.25+0.25=1.5.$
• $\require{color} 1.374=1.01\colorbox{gray}{0}111\ldots\times2^0\approx 1.25.$
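
To check answers to these exercises, the rule can be sketched in Python. The function `round_half_even` below is a hypothetical helper written for this article's toy format: it handles positive values only, uses exact rational arithmetic from the standard `fractions` module, and ignores exponent limits, infinities and NaNs. The parameter `p` is the number of mantissa bits, as in the text.

```python
from fractions import Fraction

def round_half_even(value, p=3):
    """Round a positive number to p significant bits, ties to even."""
    x = Fraction(value)
    if x <= 0:
        raise ValueError("this sketch handles positive values only")
    # Find the exponent e such that 2**e <= x < 2**(e + 1).
    e = 0
    while Fraction(2) ** (e + 1) <= x:
        e += 1
    while Fraction(2) ** e > x:
        e -= 1
    # Distance between neighbouring representable numbers at this exponent.
    ulp = Fraction(2) ** (e - (p - 1))
    scaled = x / ulp                         # mantissa, possibly with a fraction
    lower = scaled.numerator // scaled.denominator
    remainder = scaled - lower               # the part that does not fit
    if remainder > Fraction(1, 2):
        lower += 1                           # closer to the upper neighbour
    elif remainder == Fraction(1, 2) and lower % 2 == 1:
        lower += 1                           # exact tie: pick the even mantissa
    return lower * ulp

# Inputs are passed as strings so that they are converted exactly.
for text in ("4.5", "5.5", "4.53125", "5.4375", "3.75", "3.625", "1.375", "1.374"):
    print(text, "->", float(round_half_even(text)))
```

With the default `p=3` the loop reproduces the results given above, including 3.75, where rounding up carries into the next exponent and yields 4.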