In this article, we're going to discuss how we can represent integers in binary.
I'm going to describe a scenario that's weird to humans but that your computer actually has to do all the time.
Let's say that you have a 200 page book with exactly 50 000 words, and I ask you to tell me what the 37 420th word is. To make it easier for you, the author of the book wrote the number of words on each page at the bottom of the page, since the number of words can vary from page to page. To determine the actual word, you would have to keep a running total of the number of words you've seen so far, stop as soon as your total goes above 37 420, and then count words from the beginning of the page until you reach the word.
Now, let me make a small change.
Let's say you have a 200 page book with exactly 50 000 words and exactly 250
words on every page, and I ask you to tell me what the 37 420th
Well, the first page is going to have 250 words on it, the first two pages will
have 500 words, the first three pages will have 750 words, etc., so the 37
420th word will be the
(37 420 % 250) + 1⇒
word on page
(37 420 / 250) + 1⇒
Now, instead of having to go through more than a hundred pages and keep a total
of the number of words so far, you can now just flip to page 150 and find the
As you can see, when the pages have a fixed number of words on them, finding an arbitrary word given an index is quick and easy. Types in a lot of high performance languages will have a fixed number of bytes for the same exact reason. If we don't, the computer either won't know when the number ends or will have to waste time trying to figure out when the number ends.
By making the types have a fixed number of bytes, we limit the number of possible values we can represent. For example, if we have a type one byte long, then that type can only represent 28=256 different values.
We have to make a distinction between signed and unsigned integers before we go any further. Signed integers can represent zero, positive numbers, and negative numbers and unsigned integers can only represent zero and positive numbers.
If you want to count the number of people who visited your webpage, you only
want positive numbers since you can't get a negative number of views.
If you have a rating system that allows people to upvote and downvote posts,
then you want both positive and negative numbers for the net upvotes.
If you just want negative numbers, add a '
-' in front of an unsigned number
when you write your results.
Since we want people to use the right type for the job, we'll allow people to
use signed and unsigned versions of the
int represent signed integers and
unsigned int represent unsigned
As you already know, computers don't understand anything except
which we generally abbrieviate with
For us to represent something any numeric types, we have to work with this
We should also be able to handle arithmetic as efficently as possible, so we
should pick a good representation.
To see why picking a good representation for you number system is important, see
how long you can multiply MMMCDLXXXIII and MMCMXVIII without
To start out, we're just going to say that we can use eight bits (a.k.a. a byte) to represent a number, which means
a • b⇒
cin base ten (where
•is an arbitrary arithmetic operation), the binary representations of
cshould also satisfy
a • b⇒
a • bshould be quick to calculate, where
•is any arithmetic operation.
a > bshould be quick.
Since we have 256 possible numbers and we want to represent all the positive
numbers we can and zero, we can just let the binary representation of a number
be the number in base two and the resulting binary representations will satisfy
everything an unsigned type should satisfy, meaning we'll represent the numbers
Converting from base ten to binary is trivial, so I'm not going to rehash it
If you need a more in-depth explanation, check this
Since the algorithms for arithmetic in binary are just like the algorithms in base ten, we can just use the normal algorithms for arithmetic.
Since we can represent all the numbers from
255, we should be able to
represent the result of any arithmetic operation so long as the result is
Binary Base 10 111 1 0010 0101 37 + 0001 0011 + 19 0011 1000 56
We will have a problem, however, if the result is greater than
255 or less
For example, let's calculate
255 + 1.
1 1111 111 1111 1111 + 0000 0001 10000 0000
Since we only have eight bits we just drop the ninth bit, meaning
a + b⇒
c - b⇒
a by the basic rules of algebra,
255 + 1⇒
0 - 1⇒
In general, if you go above the highest number we can represent, you'll loop
back around to the lowest number we can represent, and, if you go below
the lowest number we can represent, you'll loop back around to the highest
number we can represent.
With signed Types, we also need to satisfy a few other constraints, such as:
-a + a⇒
0, since that's how you define the negative (a.k.a. additive inverse) of a number without changing how the computer adds numbers based on the sign.
Before we can come up with a representation for something, we need to establish what we're representing. In our case, we need to figure out what numbers we want to represent.
0 is pretty useful, so let's include it.
0 is its own inverse and we only want one binary representation for each
number, we don't need to include a "negative zero".
We'll also want
1, so we'll include
Since we want to be able to represent the negative of every number we add to the
list, we'll also have to include
Let's also add
-2 into the list,
-3 to the list, and so on
until we run out of numbers.
-127, we have a problem: we only have one more binary
If we added
-128, we would have to represent 257 different numbers,
which we can't do since we've run out of space.
0 is its own inverse, we don't want to include it twice, and we can only
have 2n binary representations, we will never be able to
completely represent a consecutive series of numbers centered around
Since it's only one number, though, we won't have to worry.
Right now, since we don't really know what to do with the last number, we're
just going to say that we definitely want to represent every integer from
We'll determine the last number we want to represent when we figure out how
we're going to represent our numbers.
As I showed earlier when I asked you to multiply two numbers in Roman numerals,
choosing a good representation for numbers will help you do math quickly and
Ideally, we would like to use the same exact algorithms for arithmetic
operations for signed and unsigned integers so that we don't have to have
separate hardware for signed and unsigned operations.
Since using the number in base two works for all the nonnegative numbers, let's
keep that system for all the nonnegative numbers, meaning
0000 0000 is
0000 0001 is
0000 0010 is
2, ..., and
0111 1111 is
We have a problem, though.
While representing something like
0101 0001 is all fine and good, how
do we represent
Remember, taking the negative of a number
a produces another number
-a + a⇒
To figure out what
-a is, we'll use the equivalent statement
First, we'll figure out the representation for
Since we want to use the same exact algorithms for arithmetic, we'll subtract
0 just like we normally would.
Doing the subtraction yields
1 01 01 01 01 01 01 01 0 - 0 0 0 0 0 0 0 1 11 1 1 1 1 1 1 1
where we drop the leftmost
1 from the result since we can't store it anywhere
and doing anything else requires us to modify our addition algorithm.
If we add the binary representations for
1, we'll get
, which overflows to
0, which is exactly the behavior we're looking for.
We now have two ways of figuring out the rest of the negative numbers: subtract
a positive number from
0 or multiplying by
Once we do the calculations, we'll find that
1101, ..., and
We could also express
-1 - a + 1, which may be easier to understand.
For example to calculate
Binary Base 10 1111 1111 -1 - 0101 0001 - 81 1010 1110 -82 + 0000 0001 + 1 1010 1111 -81
-1 - a + 1 process we came up with is equivalent to flipping the
bits (replacing zeros with ones and ones with zeros) and adding one, which
is how people normally describe taking the negative of a number in this system.
We've set up a system called two's complement, and it's the standard on the overwhelming majority on most computers since IBM released the System/360 in 1964. The only other options at the time were "one's compliment", in which you just flipped the bits to get the negative of a number, and the "sign-magnitude" format, in which you use the leftmost bit to represent the sign and you use the other bits to represent the absolute value.
One's compliment suffered from several problems, but the worst was
have two representations: the standard
0000 0000 and
Since zero could be coded using either representation, you would have to check
both representations whenever you wanted to make a comparison with zero.
The sign-magnitude format also has two representations for
0000 0000 and
1000 0000), but checking if a number is zero is pretty easy since you can just
ignore the leftmost bit entirely.
Other than checking for zero, however, the sign-magnitude format requires you to
turn addition into subtraction and vice versa and switch the order of the
numbers you're adding or subtracting depending on the sign.
1as the most significant (leftmost) bit and all non-negative numbers will have a
0as the most significant bit.
We've established that we're going to use two's complement to represent our
integers and that we are definitely going to represent the integers from
127, but we're still missing one number.
We can't have a representation we don't know how to interpret, or else our
program could fail.
To figure out how to deal with the missing number, we're going to first find out which binary representation hasn't been used yet and then we're going to use the properties of the binary representation to figure out what it should be.
We could do this by going through all the numbers from
their negatives, but that would take too long.
Plus, it's no fun.
Since we can flip all the bits and add one to any binary representation, we know
that we can negate the binary representation of the missing number.
The missing number, however, can't have any of the numbers from
as its negative since each of them already have a negative and the negative of a
negative is the original number.
Putting all this together means that when we negate the missing number, we have
to end up with the number itself.
Since the number is its own negative, adding it to itself must equal zero, and
since adding the number to itself is equivalent to multiplying by 2 (
binary) and multiplying by 2 will be easier in this case, we're going to
multiply by two.
Let's represent the missing number by
abcd efgh, where each letter represents
an unknown digit.
abcd efgh x 10 0 a bcde fgh0 abcde fgh0
abcd efgh to be its own inverse
bcde fgh0 must equal
meaning the missing number must be
a can only be a
0 or a
0000 0000 is zero,
1000 0000 must
be the representation for the missing number.
The missing number is kind of weird since it's its own negative and not zero.
If we subtract one, we will end up with
127, so maybe it should be
If we add one to it, however, we will end up with
1000 0001, which is
(to see it for yourself, negate
1000 0001 and convert from binary to base 10).
Our choices are either the number one more than
127 or the number one less
-127, which are
Note that the most significant bit (leftmost digit) is a
If we were to make it positive, it would be the only positive number that starts
As a general principle, every special case you have to check will make your
program more inefficient, and making
1000 0000 positive would be a special
case, so let's just make it negative.
Since we've decided that our missing number will be negative,
1000 0000 will
The system we set up works identically with more bits.
If we want unsigned numbers, we use the number in base two as the number's
binary representation, and we can represent numbers from 0 to
2n with n bits.
If we want signed numbers, we can represent numbers from -2n-1
to 2n-1-1 with n bits, where positive numbers get their
normal binary representations and negative numbers flip all the bits of their
corresponding positive numbers and add one.
0 is still
0000 ... 0000, 1 is still
0000 ... 0001, -1
1111 ... 1111, 2n-1-1 is still
0111 ... 1111, and
-2n-1 is still
1000 ... 0000.
We explained how computers represent integer numbers using a system called two's complement which will allow us to represent integers of any size.
If you're coming from the Making Sense of C series, the next article in
the series is Types in C, but
you should read The Memory
Hierarchy before reading about types in
C since it will provide use cases
for the different types.