In this article, we're going to discuss how we can represent integers in binary.
I'm going to describe a scenario that's weird to humans but that your computer actually has to do all the time.
Let's say that you have a 200 page book with exactly 50 000 words, and I ask you to tell me what the 37 420th word is. To make it easier for you, the author of the book wrote the number of words on each page at the bottom of the page, since the number of words can vary from page to page. To determine the actual word, you would have to keep a running total of the number of words you've seen so far, stop as soon as your total goes above 37 420, and then count words from the beginning of the page until you reach the word.
Now, let me make a small change.
Let's say you have a 200 page book with exactly 50 000 words and exactly 250
words on every page, and I ask you to tell me what the 37 420th
word is.
Well, the first page is going to have 250 words on it, the first two pages will
have 500 words, the first three pages will have 750 words, etc., so the 37
420th word will be the (37 420 % 250) + 1
⇒171
th
word on page (37 420 / 250) + 1
⇒150
.
Now, instead of having to go through more than a hundred pages and keep a total
of the number of words so far, you can now just flip to page 150 and find the
171th word.
As you can see, when the pages have a fixed number of words on them, finding an arbitrary word given an index is quick and easy. Types in a lot of high performance languages will have a fixed number of bytes for the same exact reason. If we don't, the computer either won't know when the number ends or will have to waste time trying to figure out when the number ends.
By making the types have a fixed number of bytes, we limit the number of possible values we can represent. For example, if we have a type one byte long, then that type can only represent 28=256 different values.
We have to make a distinction between signed and unsigned integers before we go any further. Signed integers can represent zero, positive numbers, and negative numbers and unsigned integers can only represent zero and positive numbers.
If you want to count the number of people who visited your webpage, you only
want positive numbers since you can't get a negative number of views.
If you have a rating system that allows people to upvote and downvote posts,
then you want both positive and negative numbers for the net upvotes.
If you just want negative numbers, add a '-
' in front of an unsigned number
when you write your results.
Since we want people to use the right type for the job, we'll allow people to
use signed and unsigned versions of the int
.
We'll let int
represent signed integers and unsigned int
represent unsigned
integers.
As you already know, computers don't understand anything except on
and off
,
which we generally abbrieviate with 1
and 0
.
For us to represent something any numeric types, we have to work with this
constraint.
We should also be able to handle arithmetic as efficently as possible, so we
should pick a good representation.
To see why picking a good representation for you number system is important, see
how long you can multiply MMMCDLXXXIII and MMCMXVIII without
giving up.
To start out, we're just going to say that we can use eight bits (a.k.a. a byte) to represent a number, which means
a • b
⇒c
in base ten (where •
is an arbitrary arithmetic
operation), the binary representations of a
, b
, and c
should also satisfy
a • b
⇒c
.
a • b
should be quick to calculate, where •
is any arithmetic operation.
a > b
should be quick.
Since we have 256 possible numbers and we want to represent all the positive
numbers we can and zero, we can just let the binary representation of a number
be the number in base two and the resulting binary representations will satisfy
everything an unsigned type should satisfy, meaning we'll represent the numbers
from 0
to 255
.
Converting from base ten to binary is trivial, so I'm not going to rehash it
here.
If you need a more in-depth explanation, check this
article.
Since the algorithms for arithmetic in binary are just like the algorithms in base ten, we can just use the normal algorithms for arithmetic.
Since we can represent all the numbers from 0
to 255
, we should be able to
represent the result of any arithmetic operation so long as the result is
between 0
and 255
.
For example:
Binary
Base 10
111
1
0010 0101
37
+ 0001
0011
+ 19
0011 1000
56
We will have a problem, however, if the result is greater than 255
or less
than 0
.
For example, let's calculate 255 + 1
.
1 1111 111
1111 1111
+ 0000 0001
1 0000 0000
Since we only have eight bits we just drop the ninth bit, meaning 255 +
1
⇒0
.
Since a + b
⇒c
means c - b
⇒a
by the basic rules of algebra,
255 + 1
⇒0
means 0 - 1
⇒255
.
In general, if you go above the highest number we can represent, you'll loop
back around to the lowest number we can represent, and, if you go below
the lowest number we can represent, you'll loop back around to the highest
number we can represent.
With signed Types, we also need to satisfy a few other constraints, such as:
-a + a
⇒0
, since that's how you define the negative (a.k.a. additive
inverse) of a number without changing how the computer adds numbers based on the
sign.
Before we can come up with a representation for something, we need to establish what we're representing. In our case, we need to figure out what numbers we want to represent.
We want 0
since 0
is pretty useful, so let's include it.
Since 0
is its own inverse and we only want one binary representation for each
number, we don't need to include a "negative zero".
We'll also want 1
, so we'll include 1
.
Since we want to be able to represent the negative of every number we add to the
list, we'll also have to include -1
.
Let's also add 2
and -2
into the list, 3
and -3
to the list, and so on
until we run out of numbers.
After adding 127
and -127
, we have a problem: we only have one more binary
representation available.
If we added 128
and -128
, we would have to represent 257 different numbers,
which we can't do since we've run out of space.
Since 0
is its own inverse, we don't want to include it twice, and we can only
have 2n binary representations, we will never be able to
completely represent a consecutive series of numbers centered around 0
.
Since it's only one number, though, we won't have to worry.
Right now, since we don't really know what to do with the last number, we're
just going to say that we definitely want to represent every integer from -127
to 127
.
We'll determine the last number we want to represent when we figure out how
we're going to represent our numbers.
As I showed earlier when I asked you to multiply two numbers in Roman numerals,
choosing a good representation for numbers will help you do math quickly and
effectively.
Ideally, we would like to use the same exact algorithms for arithmetic
operations for signed and unsigned integers so that we don't have to have
separate hardware for signed and unsigned operations.
Since using the number in base two works for all the nonnegative numbers, let's
keep that system for all the nonnegative numbers, meaning 0000 0000
is 0
,
0000 0001
is 1
, 0000 0010
is 2
, ..., and 0111 1111
is 127
.
We have a problem, though.
While representing something like 81
as 0101 0001
is all fine and good, how
do we represent -81
?
Remember, taking the negative of a number a
produces another number -a
such
that -a + a
⇒0
.
To figure out what -a
is, we'll use the equivalent statement 0 -
a
⇒-a
.
First, we'll figure out the representation for -1
.
Since we want to use the same exact algorithms for arithmetic, we'll subtract
1
from 0
just like we normally would.
Doing the subtraction yields
10101010 10101010
- 0 0 0 0 0 0 0 1
1 1 1 1 1 1 1 1 1
where we drop the leftmost 1
from the result since we can't store it anywhere
and doing anything else requires us to modify our addition algorithm.
If we add the binary representations for -1
and 1
, we'll get
, which overflows to 1 0000
00000
, which is exactly the behavior we're looking for.
We now have two ways of figuring out the rest of the negative numbers: subtract
a positive number from 0
or multiplying by -1
.
Once we do the calculations, we'll find that -2
is 1111 1110
, -3
is 1111
1101
, ..., and -127
is 1000 0001
.
We could also express -a
as -1 - a + 1
, which may be easier to understand.
For example to calculate -81
:
Binary
Base 10
1111 1111
-1
- 0101 0001
- 81
1010 1110
-82
+ 0000 0001
+ 1
1010 1111
-81
The new -1 - a + 1
process we came up with is equivalent to flipping the
bits (replacing zeros with ones and ones with zeros) and adding one, which
is how people normally describe taking the negative of a number in this system.
We've set up a system called two's complement, and it's the standard on the overwhelming majority on most computers since IBM released the System/360 in 1964. The only other options at the time were "one's compliment", in which you just flipped the bits to get the negative of a number, and the "sign-magnitude" format, in which you use the leftmost bit to represent the sign and you use the other bits to represent the absolute value.
One's compliment suffered from several problems, but the worst was 0
would
have two representations: the standard 0000 0000
and 1111 1111
.
Since zero could be coded using either representation, you would have to check
both representations whenever you wanted to make a comparison with zero.
The sign-magnitude format also has two representations for 0
(0000 0000
and
1000 0000
), but checking if a number is zero is pretty easy since you can just
ignore the leftmost bit entirely.
Other than checking for zero, however, the sign-magnitude format requires you to
turn addition into subtraction and vice versa and switch the order of the
numbers you're adding or subtracting depending on the sign.
1
as the most significant (leftmost) bit and
all non-negative numbers will have a 0
as the most significant bit.
We've established that we're going to use two's complement to represent our
integers and that we are definitely going to represent the integers from -127
to 127
, but we're still missing one number.
We can't have a representation we don't know how to interpret, or else our
program could fail.
To figure out how to deal with the missing number, we're going to first find out which binary representation hasn't been used yet and then we're going to use the properties of the binary representation to figure out what it should be.
We could do this by going through all the numbers from 0
to 127
, finding
their negatives, but that would take too long.
Plus, it's no fun.
Since we can flip all the bits and add one to any binary representation, we know
that we can negate the binary representation of the missing number.
The missing number, however, can't have any of the numbers from -127
to 127
as its negative since each of them already have a negative and the negative of a
negative is the original number.
Putting all this together means that when we negate the missing number, we have
to end up with the number itself.
Since the number is its own negative, adding it to itself must equal zero, and
since adding the number to itself is equivalent to multiplying by 2 (10
in
binary) and multiplying by 2 will be easier in this case, we're going to
multiply by two.
Let's represent the missing number by abcd efgh
, where each letter represents
an unknown digit.
abcd efgh
x 10
0
a bcde fgh0
a bcde fgh0
For abcd efgh
to be its own inverse bcde fgh0
must equal 0000 0000
,
meaning the missing number must be a000 0000
.
Since a
can only be a 0
or a 1
and 0000 0000
is zero, 1000 0000
must
be the representation for the missing number.
1000 0000
Represent?The missing number is kind of weird since it's its own negative and not zero.
If we subtract one, we will end up with 127
, so maybe it should be 127
.
If we add one to it, however, we will end up with 1000 0001
, which is -127
(to see it for yourself, negate 1000 0001
and convert from binary to base 10).
Our choices are either the number one more than 127
or the number one less
than -127
, which are 128
and -128
.
Note that the most significant bit (leftmost digit) is a 1
.
If we were to make it positive, it would be the only positive number that starts
with a 1
.
As a general principle, every special case you have to check will make your
program more inefficient, and making 1000 0000
positive would be a special
case, so let's just make it negative.
Since we've decided that our missing number will be negative, 1000 0000
will
represent -128
.
The system we set up works identically with more bits.
If we want unsigned numbers, we use the number in base two as the number's
binary representation, and we can represent numbers from 0 to
2n with n bits.
If we want signed numbers, we can represent numbers from -2n-1
to 2n-1-1 with n bits, where positive numbers get their
normal binary representations and negative numbers flip all the bits of their
corresponding positive numbers and add one.
0 is still 0000 ... 0000
, 1 is still 0000 ... 0001
, -1
is still 1111 ... 1111
, 2n-1-1 is still 0111 ... 1111
, and
-2n-1 is still 1000 ... 0000
.
We explained how computers represent integer numbers using a system called two's complement which will allow us to represent integers of any size.
If you're coming from the Making Sense of C series, the next article in
the series is Types in C, but
you should read The Memory
Hierarchy before reading about types in C
since it will provide use cases
for the different types.