IEEE 754-2008 specifies five rounding functions:

- Round toward \(-\infty\) (RD): the largest floating point number less than or equal to \(x\).
- Round toward \(+\infty\) (RU): the smallest floating point number greater than or equal to \(x\).
- Round toward zero (RZ): the closest floating point number whose absolute value is no greater than that of \(x\).
- Round to nearest, ties to even (RN): the floating point number nearest to \(x\). When \(x\) falls exactly halfway between two consecutive floating point numbers, pick the one whose significand is even (its last digit is even).
- Round to nearest, ties to away (RN): the floating point number nearest to \(x\). When \(x\) falls exactly halfway between two consecutive floating point numbers, pick the one of greater magnitude.

Round ties to even is the default rounding for binary formats in IEEE 754-2008.
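The five functions can be compared side by side with a small sketch. It uses Python's `decimal` module, so it rounds in radix 10 rather than radix 2; the mapping from the names above to `decimal` constants, the helper `round_all`, and the precision of 3 digits are choices made for this illustration only.

```python
from decimal import (Context, Decimal, ROUND_CEILING, ROUND_DOWN,
                     ROUND_FLOOR, ROUND_HALF_EVEN, ROUND_HALF_UP)

# Assumed mapping from the rounding functions to decimal's constants.
MODES = {
    "RD": ROUND_FLOOR,                    # toward -infinity
    "RU": ROUND_CEILING,                  # toward +infinity
    "RZ": ROUND_DOWN,                     # toward zero
    "RN ties-to-even": ROUND_HALF_EVEN,
    "RN ties-to-away": ROUND_HALF_UP,     # ties go away from zero
}

def round_all(x: Decimal, precision: int = 3) -> dict:
    """Round x to `precision` significant digits under each mode."""
    return {
        name: Context(prec=precision, rounding=mode).plus(x)
        for name, mode in MODES.items()
    }

# 1.225 lies exactly halfway between 1.22 and 1.23 at 3 digits.
for x in (Decimal("1.225"), Decimal("-1.225")):
    print(x, {name: str(y) for name, y in round_all(x).items()})
```

For the negative input, RD and RZ disagree (RD moves away from zero, RZ toward it), which is the easiest way to tell the two apart.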

The result of a function is called **correctly rounded** if it equals
the value obtained by first computing the function with infinite precision
and unlimited range, and then rounding that value with the active rounding function.
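As an illustration of this definition, the sketch below computes an operation exactly with rational arithmetic and rounds once at the end, then compares against the hardware result. It assumes that Python's `float` is binary64 and that converting a `Fraction` to `float` rounds to nearest with ties to even (true for CPython as far as I know); `exact_then_round` is a name invented here.

```python
import operator
from fractions import Fraction

def exact_then_round(a: float, b: float, op) -> float:
    """Compute op(a, b) exactly, then round the exact value once."""
    exact = op(Fraction(a), Fraction(b))  # no rounding: rationals are exact
    return float(exact)                   # the single rounding step

# If the hardware operation is correctly rounded, both sides agree.
for op in (operator.add, operator.mul, operator.truediv):
    print(op.__name__, exact_then_round(0.1, 0.3, op) == op(0.1, 0.3))
```

The inputs `0.1` and `0.3` are themselves already rounded; the check only concerns the rounding of the operation's result.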

For \(\beta=2\), precision \(p\), and a normal \(x\)
(i.e. \(x\ge2^{e_{\textit{min}}}\)), let the infinitely precise
significand be \(1.m_{1}m_{2}m_{3}\dots\). The first \(p\) bits,
\(1.m_{1}\dots m_{p-1}\), fit in the format. Define the **round
bit** to be \(m_{p}\), the first bit that does not fit, and the
**sticky bit** to be the logical OR of all bits from \(m_{p+1}\) onwards.
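The definition can be made concrete with a short sketch that extracts both bits from an exact rational \(x\). The function name `round_and_sticky` and its return layout are assumptions of this example, and it ignores \(e_{\textit{min}}\) (the exponent is treated as unbounded).

```python
from fractions import Fraction

def round_and_sticky(x: Fraction, p: int):
    """Split a positive x into (kept, e, round_bit, sticky_bit).

    kept is the integer with binary digits 1 m1 ... m_{p-1},
    e satisfies 2**e <= x < 2**(e + 1).
    """
    if x <= 0:
        raise ValueError("this sketch only handles x > 0")
    # Estimate floor(log2(x)) from bit lengths, then correct by at most one.
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    # Scale so that the integer part holds the bits 1 m1 ... m_{p-1} m_p.
    t = x / Fraction(2) ** e * 2 ** p
    q = t.numerator // t.denominator
    round_bit = q & 1            # m_p
    sticky_bit = int(t != q)     # 1 if any bit after m_p is nonzero
    return q >> 1, e, round_bit, sticky_bit

# 11/8 = 1.011 in binary; with p = 3 the kept bits are 1.01 and m_3 = 1.
print(round_and_sticky(Fraction(11, 8), 3))
# 1/3 = 1.010101... * 2**-2; with p = 4 the kept bits are 1.010.
print(round_and_sticky(Fraction(1, 3), 4))
```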

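For a positive \(x\), these two bits determine how to round, as shown in the table below:

round | sticky | RD | RU | RN
---|---|---|---|---
0 | 0 | \(-\) | \(-\) | \(-\)
0 | 1 | \(-\) | \(+\) | \(-\)
1 | 0 | \(-\) | \(+\) | \(-/+\)
1 | 1 | \(-\) | \(+\) | \(+\)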

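A \(-\) means that the significand is merely truncated.

A \(+\) means that the significand is truncated and \(2^{-p+1}\) (one unit in its last place) is then added to it.

A \(-/+\) marks the halfway case: the result is \(-\) or \(+\) depending on the tie-breaking rule (ties to even or ties to away).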
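The decision rules can be sketched as a function of the two bits. Here the truncated significand is held as an integer `kept` (binary digits \(1m_{1}\dots m_{p-1}\)), so adding \(2^{-p+1}\) becomes adding 1. The function name, the mode strings `"RNE"`/`"RNA"`, and the restriction to positive \(x\) are assumptions of this example.

```python
def round_significand(kept: int, round_bit: int, sticky_bit: int,
                      mode: str) -> int:
    """Return the rounded integer significand for a positive x.

    If the result reaches 2**p, the caller must renormalise by halving
    the significand and incrementing the exponent.
    """
    if mode == "RD":                      # always truncate
        return kept
    if mode == "RU":                      # add unless x was exact
        return kept + (1 if (round_bit or sticky_bit) else 0)
    if mode in ("RNE", "RNA"):
        if round_bit == 0:                # below the halfway point
            return kept
        if sticky_bit == 1:               # above the halfway point
            return kept + 1
        if mode == "RNA":                 # halfway: larger magnitude
            return kept + 1
        return kept + (kept & 1)          # halfway: make the last bit even
    raise ValueError(f"unknown mode: {mode}")

# Halfway case with an odd truncated significand 0b101:
print(round_significand(0b101, 1, 0, "RNE"))  # rounds up to 0b110
print(round_significand(0b100, 1, 0, "RNE"))  # stays at 0b100
```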

RD, RU and RZ are called the **directed rounding modes**.

A **rounding breakpoint** is a value at which the rounding function
changes value. For RD, RU and RZ, the breakpoints are the floating point
numbers themselves. For RN, they are the midpoints between consecutive floating point numbers.
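A quick way to see the two kinds of breakpoints is to probe just below and just above them in binary64. The sketch assumes that `float(Fraction)` rounds to nearest with ties to even in CPython; `rd_in_1_2` is a helper invented here that applies RD only for \(1\le x<2\).

```python
import math
import sys
from fractions import Fraction

lo = 1.0
hi = 1.0 + sys.float_info.epsilon        # next binary64 number above 1.0
mid = (Fraction(lo) + Fraction(hi)) / 2  # midpoint: an RN breakpoint
tiny = Fraction(1, 2 ** 200)             # far smaller than the spacing

def rd_in_1_2(x: Fraction) -> float:
    """RD for 1 <= x < 2 in binary64 (spacing 2**-52 in that range)."""
    return float(Fraction(math.floor(x * 2 ** 52), 2 ** 52))

# RN changes value at the midpoint, not at hi.
print(float(mid - tiny) == lo, float(mid + tiny) == hi)
# RD changes value at the floating point number hi itself.
print(rd_in_1_2(Fraction(hi) - tiny) == lo, rd_in_1_2(Fraction(hi)) == hi)
```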

## Useful Properties

All the rounding functions are monotonically non-decreasing: if \(x\le y\), then the rounded value of \(x\) is less than or equal to the rounded value of \(y\).

## Handling Denormals and Large Values

Let \(\alpha=\beta^{e_{\textit{min}}-p+1}\) be the smallest positive denormal number.

Let \(\Omega=(\beta-\beta^{1-p})\beta^{e_{\textit{max}}}\) be the largest finite floating point number.

Then, for positive \(x\):

- RN (ties to even) returns 0 if \(0<x\le\alpha/2\)
- RN (ties to even) returns \(+\infty\) if \(x\ge(\beta-\beta^{1-p}/2)\beta^{e_{\textit{max}}}\)
- RN (ties to away) returns 0 if \(0<x<\alpha/2\)
- RN (ties to away) returns \(+\infty\) if \(x\ge(\beta-\beta^{1-p}/2)\beta^{e_{\textit{max}}}\)
- RD returns 0 if \(0<x<\alpha\)
- RD returns \(\Omega\) if \(x\ge\Omega\)
- RU returns \(\alpha\) if \(0<x\le\alpha\)
- RU returns \(+\infty\) if \(x>\Omega\)
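These edge cases can be checked on a toy format small enough to reason about by hand. The sketch below assumes a hypothetical binary format with \(p=3\), \(e_{\textit{min}}=-2\), \(e_{\textit{max}}=2\), so \(\alpha=1/16\) and \(\Omega=7\); `round_positive` and the mode strings are names made up for this example.

```python
import math
from fractions import Fraction

P, E_MIN, E_MAX = 3, -2, 2               # hypothetical toy format, beta = 2
ALPHA = Fraction(2) ** (E_MIN - P + 1)   # smallest denormal: 1/16
OMEGA = (2 - Fraction(2) ** (1 - P)) * Fraction(2) ** E_MAX  # largest: 7

def round_positive(x: Fraction, mode: str):
    """Round x > 0 in the toy format; returns a Fraction or math.inf."""
    # Exponent of x, clamped at E_MIN so denormals share the spacing ALPHA.
    e = E_MIN
    while Fraction(2) ** (e + 1) <= x:
        e += 1
    ulp = Fraction(2) ** (e - P + 1)     # spacing of the format around x
    q = x / ulp
    n = q.numerator // q.denominator     # truncated multiple of ulp
    rem = q - n                          # discarded fraction, 0 <= rem < 1
    half = Fraction(1, 2)
    if mode == "RD":
        up = False
    elif mode == "RU":
        up = rem > 0
    elif mode == "RNE":
        up = rem > half or (rem == half and n % 2 == 1)
    elif mode == "RNA":
        up = rem >= half
    else:
        raise ValueError(f"unknown mode: {mode}")
    y = (n + int(up)) * ulp              # rounded with unbounded exponent
    if y > OMEGA:                        # overflow
        return OMEGA if mode == "RD" else math.inf
    return y

print(round_positive(ALPHA / 2, "RNE"), round_positive(ALPHA / 2, "RNA"))
print(round_positive(Fraction(15, 2), "RNE"))  # 7.5 is the RN threshold
```

In this format the RN overflow threshold \((\beta-\beta^{1-p}/2)\beta^{e_{\textit{max}}}\) evaluates to 7.5, the midpoint between \(\Omega=7\) and the next value, 8, that the format would reach with one more exponent.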