Floating point number: what is it? Simply explained with examples

Floating point number: what is it?

In computing, floating point numbers are often used wherever values have to be handled with a certain accuracy, for example in measuring devices.

A floating point number (also written "floating-point number") is a representation of a number in exponential notation. Many values can only be represented approximately this way. For example, the number 1230000 can be written as 1.23 ⋅ 10⁶.
The 1.23 is called the "mantissa", the 10 is the "base" and the 6 is the "exponent". A sign can also be added to the mantissa. The same idea works in the binary system: the number 10101100 can be written as 1.0101100 ⋅ 2⁷. The computer only stores the sign, mantissa and exponent.
Computers usually shift the binary point until there is only a single 1 in front of it. Then the computer only has to store the digits of the mantissa after the point, plus the exponent.
So that the exponent can be stored as a non-negative number, a fixed value, the so-called bias, is added to it. The smallest possible exponent (−bias) is therefore stored as 0.
In contrast to a fixed-point number, the point in a floating-point number is not at a fixed position.
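The shifting of the point and the added bias can be tried out in a few lines of Python. This is only a minimal sketch: the helper name normalize is made up for this illustration, and the bias of 127 is the one used by the 32-bit "float" format.

```python
import math

def normalize(value):
    """Return (sign, mantissa, exponent) with 1 <= mantissa < 2 for a nonzero value."""
    sign = 1 if value < 0 else 0
    # math.frexp returns m, e with value == m * 2**e and 0.5 <= m < 1
    m, e = math.frexp(abs(value))
    # Shift by one place so that a single 1 stands in front of the binary point
    return sign, m * 2, e - 1

sign, mantissa, exponent = normalize(0b10101100)
print(sign, mantissa, exponent)  # 0 1.34375 7
print(exponent + 127)            # 134, the exponent as it would be stored in a "float"
```

The mantissa 1.34375 is the decimal value of the binary number 1.0101100.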

Half, Float & Double - Common encodings of floating point numbers

You have certainly stumbled across these three terms, especially when programming with Arduino. These are standardized representations.

The data type "half" is a 16-bit number. The leftmost bit is responsible for the sign. The exponent has 5 bits and the mantissa 10. The bias used is 15. Since the first bit of the mantissa is (almost) always 1, this is not saved.
The "float" (or "single") data type is a 32-bit number. Here, too, a bit is used for the sign. However, the exponent has 8 bits (bias = 127) and the mantissa 23.
The data type "double" also uses a bit for the sign. Here, however, the exponent has 11 bits (bias = 1023) and the mantissa even 52 bits. In total, this is 64 bits, i.e. 8 bytes.
In addition to these three common data types, there are many more. However, these are mostly not used because the accuracy of half, float and double is already good enough.
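The three layouts can be inspected with Python's standard struct module, whose format codes e, f and d correspond to half, float and double. The following is a sketch only: the table FORMATS and the helper fields are names invented for this example.

```python
import struct

# name: (float format code, unsigned integer code of the same size,
#        exponent bits, mantissa bits, bias)
FORMATS = {
    "half":   ("<e", "<H", 5, 10, 15),
    "float":  ("<f", "<I", 8, 23, 127),
    "double": ("<d", "<Q", 11, 52, 1023),
}

def fields(value, name):
    """Split value into (sign, stored exponent, stored mantissa) for one format."""
    float_code, int_code, exp_bits, man_bits, _bias = FORMATS[name]
    # Reinterpret the packed bytes as an unsigned integer of the same width
    (raw,) = struct.unpack(int_code, struct.pack(float_code, value))
    sign = raw >> (exp_bits + man_bits)
    exponent = (raw >> man_bits) & ((1 << exp_bits) - 1)
    mantissa = raw & ((1 << man_bits) - 1)
    return sign, exponent, mantissa

for name, (float_code, _, _, _, bias) in FORMATS.items():
    size = struct.calcsize(float_code) * 8
    print(name, size, "bits", fields(1.0, name), "bias", bias)
```

For the value 1.0 the true exponent is 0, so the stored exponent equals the bias in each format, and the stored mantissa is 0 because the leading 1 is not saved.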

Convert decimal numbers to floating point numbers - how it works

Next, we would like to show you how you can convert a normal decimal number into a floating point number.

In this example we use the decimal number 18.4. First convert the integer part, 18, to the binary system. You should get (10010)₂.
Then convert the fractional part, 0.4. Multiply 0.4 by 2 to get 0.8 and note the digit in front of the point, in this case a 0. Then multiply 0.8 by 2. This time you get 1.6, so note the 1 and continue with 0.6. After a while you will notice that the pattern repeats (in this example). Finally, write down all the noted digits from top to bottom: 011001100110 ...
Then join the two parts and append ⋅ 2⁰, so that you get 10010.01100110 ... ⋅ 2⁰. Now move the point until only a single 1 is in front of it, and adjust the power accordingly. You should get 1.001001100110 ... ⋅ 2⁴, since you moved the point 4 places to the left. This step is called "normalizing".
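The repeated doubling can be written as a short loop. The sketch below uses Python's Fraction type so that 0.4 is handled exactly; the function name fraction_bits is an assumption made for this example.

```python
from fractions import Fraction

def fraction_bits(fraction, count):
    """Return the first `count` binary digits after the point of 0 <= fraction < 1."""
    bits = []
    for _ in range(count):
        fraction *= 2
        digit = int(fraction)   # the digit in front of the point: 0 or 1
        bits.append(str(digit))
        fraction -= digit       # continue with the remainder
    return "".join(bits)

print(bin(18)[2:])                         # 10010
print(fraction_bits(Fraction(4, 10), 12))  # 011001100110
```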
In this example we use the data type "float", so add its bias to the exponent: 4 + 127 = 131. Then convert the result to binary: 131 is 10000011 in the binary system.
Now you can write down the finished floating point number. First write the sign bit. Since the number is positive, the first bit is a 0. Then write the biased exponent, 10000011. It fits exactly, since it needs 8 bits and a float has 8 exponent bits available. Finally, write down the first 23 bits of the mantissa after the point, since a float has 23 mantissa bits available.
Your finished floating point number should therefore be 01000001100100110011001100110011. Split into its fields, it is easier to read: 0 | 10000011 | 00100110011001100110011.
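The result can be checked against the bit pattern the computer itself produces. This sketch uses the standard struct module, and the helper name float_bits is made up for this example. Note that struct rounds to the nearest representable value, while the manual method cuts the mantissa off after 23 bits. For 18.4 both give the same bits.

```python
import struct

def float_bits(value):
    """Return the 32-bit pattern of value as 'sign | exponent | mantissa'."""
    # Pack as big-endian single precision, then read the same 4 bytes as an integer
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    bits = format(raw, "032b")
    return f"{bits[0]} | {bits[1:9]} | {bits[9:]}"

print(format(4 + 127, "08b"))  # 10000011
print(float_bits(18.4))        # 0 | 10000011 | 00100110011001100110011
```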

Convert floating point number to decimal number - Here's how

Finally, we would like to show you how you can convert a floating point number into a decimal number again. For this we take the number 1000001100100110011001100110011.

First pad the number with leading zeros until it is 16, 32 or 64 bits long. In this case it becomes 01000001100100110011001100110011.
The first bit is the sign. It is 0, so our number is positive.
Then take the next (in this case) 8 bits and subtract the bias: (10000011)₂ = 131 → 131 − 127 = 4, so the number ends in "⋅ 2⁴".
Now write "1." followed by all the remaining bits and the "⋅ 2⁴": 1.00100110011001100110011 ⋅ 2⁴
Then move the point 4 places to the right so that you can drop the "⋅ 2⁴": 10010.0110011001100110011
Next, convert 10010 to an integer as usual. You get 18.
Now convert the digits after the point. The first digit after the point has the value 1/2¹, the second 1/2² and so on. Add up the values of all digits that are 1, then add the integer part 18. The result is 18.3999996185302734375. It is not exactly 18.4 because the repeating binary pattern was cut off after 23 mantissa bits.
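The decoding steps above can be sketched in Python as follows. The function name decode_float32 is hypothetical, and the sketch only covers normalized 32-bit numbers. Special cases such as zero, infinity or NaN are deliberately left out. Fraction is used so that the result is exact.

```python
from fractions import Fraction

def decode_float32(bits):
    """Decode a normalized 32-bit pattern given as a string of 0s and 1s."""
    bits = bits.zfill(32)                 # pad with leading zeros to 32 bits
    sign = -1 if bits[0] == "1" else 1
    exponent = int(bits[1:9], 2) - 127    # subtract the bias
    # Implicit leading 1, then mantissa bit number i contributes 1 / 2**i
    mantissa = 1 + sum(Fraction(int(b), 2 ** i)
                       for i, b in enumerate(bits[9:], start=1))
    return sign * mantissa * Fraction(2) ** exponent

value = decode_float32("1000001100100110011001100110011")
print(value == Fraction("18.3999996185302734375"))  # True
```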

In the next practical tip we will show you how you can convert ASCII letters to binary numbers.
