You probably already know that a floating point number is represented in
binary as `sign * significand * radix^exponent`. Thus you can represent
the number 20.23 with radix 10 (i.e. base 10) as `1 * .2023 * 10^2` or
as `1 * .02023 * 10^3`.

Since one number can be represented in different ways, we define the
*normalised* version as the one that satisfies
``1/radix <= significand < 1``. You can read that as saying "the
leftmost digit in the significand should not be zero".

So when we work in binary (base 2) rather than base 10, saying that the "leftmost digit should not be zero" means it can only be one. In fact, the IEEE standard "hides" that 1 because it is implied by a normalised number, giving you an extra bit of precision in the significand.

So to normalise a floating point number you have to shift the significand left until the first digit is a one, adjusting the exponent to match. This is something that the hardware can probably do very fast, since it has to do it a lot. Combine this with an architecture like IA64, which has a 64 bit significand, and you've just found a way to do a really cool implementation of "find the first bit that is not zero in a 64 bit value", a common operation when working with bitfields. (It was really David Mosberger who originally came up with that idea in the kernel.)

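On IA64, the `getf.exp` instruction reads the exponent straight out of a floating point register:

```c
#include <stdio.h>

/* Read the exponent field of a floating point register into a
 * general register (IA64 only, GCC statement-expression syntax). */
#define ia64_getf_exp(x)                                        \
({                                                              \
    long ia64_intri_res;                                        \
                                                                \
    asm ("getf.exp %0=%1" : "=r"(ia64_intri_res) : "f"(x));     \
                                                                \
    ia64_intri_res;                                             \
})

int main(void)
{
    long double d = 0x1UL;
    long exp;

    exp = ia64_getf_exp(d);
    /* The stored exponent is biased by 65535 (0xFFFF). */
    printf("The first non-zero bit is bit %ld\n", exp - 65535);
    return 0;
}
```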

Note the processor is using an 82 bit floating point register format, with a 17 bit exponent field. The exponent is stored with a bias of 0xFFFF (65535) so that it can represent both positive and negative exponents without an explicit sign bit (i.e. zero is represented by 65535, 1 by 65536 and -1 by 65534). That is why the code above subtracts 65535.

IA64 uses the floating point registers in other interesting ways too.
For example, the `clear_page()` implementation in the kernel spills
zero'd floating point registers into memory because that provides you
with the maximum memory bandwidth. The libc `bzero()` implementation
does a similar thing.