Fixed point numbers
10 Jan 2015In today’s post, I’m going to go over a very old technique for working with fractional numbers. Fixed point numbers are defined by dividing an integral data type (8, 16, 32 or 64 bit) word into a whole part and fractional part.
The idea
So, what you do is you need to split your number into a whole part and fractional part. This means you dedicate some bit-space of your word to both. For a simple example, let’s work with an byte (traditioanlly 8 bits) and split the number evenly giving the whole part 4 bits as well as the fractional part.
Above, we see our number represented in binary with all 8 bits laid out. The upper 4 bits are dedicated to the whole part, the lower 4 bits to the fractional.
The whole part
In this scenario (of a 4:4) split, you’ve probably already worked out that the maximum whole value we can hold is 15. Some example numbers as they appear might be:
First thing you’ll notice is, every number is effectivly shifted 4 bits to the left. This is how you load a fixed point number into memory. You shift the original integer to the left by the number of bits you have allocated for the fractional part.
Nothing overly special here. The fractional part is a little different to think about, but still makes sense.
The fractional part
As above, we’ve dedicated 4 bits to our fractional part as well. With binary numbers, the value of the bit slots go up in powers of two; going in the opposite direction they also go down in powers of two. With normal numbers, bit slot 0 actually corresponds to 2^0. By sliding this bit slot up by four, we effectivly start counting negative powers into the fractional part:
Using this knowledge, we can start to treat the lower 4 bits as the fractional part of the number. Even bit shifting (multiplication by 2) works as you’d expect:
All we did was shift the first number by 1 slot to the left. This has the overall effect of multiplying the number by 2 - makes sense, we’re in binary after all.
Ok, let’s take a look at how some simple operations work in code.
Show your working
Working with a 4:4 fixed point number is fine for pen-and-paper type stuff, but we’ll want to scale that up once we start to work with code. We’ll target a 16 bit number now, using the lower 6 bits for fractionals so the number is now a 10:6.
First off, we’ll want to be able to load our numbers. As above, this is as simple as shifting the numbers left by the number of bits we have dedicated to the fractional portion of our number. In this case, 6.
Of course, to get our whole number back again; all we do is shift a fixed point number to the right by the same number of bits again. Getting the fractional part of the number back out, again, depends on how many bits you have dedicated to the number’s precision (or fractional part). Because we’ve dedicated 6, this means the smallest number (and therefore the most accurate precision) we can use is actually 0.015625. We only have 6 bits precision, which gives us a range of 0-63.
This formula is what you’d use to get the fractional part of our number:
So, if we had the number 0.25 which would be represented by 0000000000010000
, we can extract 0.25 from this using the formula above:
Addition and subtraction are handled just like normal. Nothing special needs to happen here. Multiplication and Division both require a little more effort.
Once you’ve performed your multiplication, you need to adjust the product by the number of fractional bits to the right.
Division requires a little more care as well. We pre-prepare the numbers by shifting them by the whole part prior to operation.
Whilst these operations are more complex than their natural analogues, you can see that there’s not that much too them.
Conclusion
All of this information is old, and that’s the point. Make sure you check out the fixed point writeup hosted by textfiles.com.