How to store an integer in the Memory?
Before learning how to store an integer in memory, we first review some information about the integer. In modern computer systems, the integer usually has three kinds of length: int
, short int
, and long int
. In the 64 bits system, the int
generally occupies 4 bytes memory space, and the short int
needs 2 bytes, the long int
needs 8 bytes in Linux/Mac or 4 bytes in Windows. If the integer has a sign, the highest bit of the integer binary is the sign bit. The sign bit is one means the integer is negative, zero means positive.
How to store an Integer in Memory?
The simplest way to store an integer is to save the binary of the integer, but in modern computer systems, they save the integers without this. The reason is the subtraction of integers. From our point of view, subtraction is similar to addition. If you know how to calculate addition, you must know how to calculate a subtraction. But in computer systems and hardware, the addition and the subtraction are two distinct operations, so the engineers need to design two different circuits to implement them, which is very complex.
The simplest is the best. Smart scientists begin to think about how to merge the subtraction and the addition to one operation. Finally, they implemented a new storing format for integers, and there are three concepts we should know:
- Original code
- Reverse code
- Complement code
Original code
Convert an integer to binary format, and this is the original code. e.g.:
short a = 6
, theoriginal code
ofa
is0000 0000 0000 0110
;short b = -18
theoriginal code
ofb
is1000 0000 0001 0010
(the highest bit of the b is one means the b is negative)
Reverse code
The reverse code has some differences between negative and positive. For positive, the reverse code is equal to its original code. For negative, the reverse code is to reverse all bits of the original code except the sign bit (convert 1 to 0, 0 to 1).
short a = 6
- original code:
0000 0000 0000 0110
- reverse code:
0000 0000 0000 0110
- original code:
short b = -18
- original code:
1000 0000 0001 0010
- reverse code:
1111 1111 1110 1101
- original code:
Complement code
For positive, the complement code is equal to the reverse code and original code. For negative, the complete code has a minor modification, adding one to the reverse code.
short a = 6
- original code:- - -
0000 0000 0000 0110
- reverse code:- - - -
0000 0000 0000 0110
- complement code:- -
0000 0000 0000 0110
- original code:- - -
short b = -18
- original code:- - -
1000 0000 0001 0010
- reverse code:- - - -
1111 1111 1110 1101
reverse all bits except sign bit - complement code:- -
1111 1111 1110 1110
reverse code plus 1
- original code:- - -
At present, the computer systems store integers with the complement code
format. When reading the integers, we need to reversely convert the complement code
to the reverse code
and then to the original code
How does the complement code help computers to execute the subtraction?
We are ready to execute the expression 6 - 18
, the 6 - 18
is equal to the 6 + (-18)
.
If we add the original codes of the 6
and -18
directly, can we get a correct answer?
- =
6 + (-18)
- =
0000 0000 0000 0110
original +1000 0000 0001 0010
original - =
1000 0000 0001 1000
original - =
-24
If we make the sign bit join the calculation, we can only get an error answer.
If we add the reverse codes of the 6
and -18
, what’s will happen?
- =
6 + (-18)
- =
0000 0000 0000 0110
reverse +1111 1111 1110 1101
reverse - =
1111 1111 1111 0011
reverse - =
1000 0000 0000 1100
original - =
-12
The answer -12
is correct, can we calculate the correct answer just depends on reverse codes? Let’s see another example: 18 - 6
:
- =
18 + (-6)
- =
0000 0000 0001 0010
reverse +1111 1111 1111 1001
reverse - =
1 0000 0000 0000 1011
reverse - =
0000 0000 0000 1011
reverse - =
0000 0000 0000 1011
original - =
11
The correct answer is 12
, but the calculation result is 11
. It’s one less than the correct answer. The result of a small number minus a large number is right, is the result one less than the correct answer whenever a number minus a smaller one?
You can inspect it by yourself. My answer is YES. So we need to figure out a way to add one to the result when calculating a large number minus a small one. Now, it’s time to introduce the complement code
, a genius-like idea.
The genius-like idea complement code
The complement code is the reverse code plus one. If we calculate a small number minus a large number, we will plus one when converting the reverse code to the complement code. We know the answer is negative, so we need to convert the complement code to the reverse code reversely. The result will be minus one at this time. Finally, the answer won’t have any changes.
If we calculate a large number minus a small number, we know the result is one less than the correct answer when calculating them with the reverse codes. If we use the complement code, because the complement code is reverse code plus one, the answer is positive that we don’t need to revert it again. Finally, the result is equal to the correct answer.
The complement code is a genius-like design significantly reducing the complexity of the circuit
The value range of integers
The short
, int
, and long
are the common integer types in C. They can only store a limited-length integer. If the integer is too long, the over part would be cut, the final value saved would be an error. We say this situation as overflow
.
The value range of unsigned integers
Easily to calculate, we make an example with short int
. The short int
occupies a byte, eight bits, to store an integer, setting all bits to 1 is the max value, setting all bits to 0 is the min value. The max value 1111 1111
equals 2^8 - 1 = 255
, and we use a small trick to calculate the max value fleetly. The value of 1111 1111
is not easy to get, so we can add 1 to it to get the result 1 0000 0000
and then minus 1.
bytes | min value | max value | |
---|---|---|---|
unsigned char | 1 byte | 0 |
2^8 - 1 = 255 |
unsigned short | 2 bytes | 0 |
2^16 - 1 = 65,535 ≈ 65 thousand |
unsigned int | 4 bytes | 0 |
2^32 - 1 = 4,294,967,295 ≈ 4.2 billion |
unsigned long | 8 bytes | 0 |
2^64 - 1 ≈ 1.84*10^19 |
The value range of signed integers
complement code | reverse code | original code | value |
---|---|---|---|
1111 1111 | 1111 1110 | 1000 0001 | -1 |
1111 1110 | 1111 1101 | 1000 0010 | -2 |
1111 1101 | 1111 1100 | 1000 0011 | -3 |
… | … | … | … |
1000 0011 | 1000 0010 | 1111 1101 | -125 |
1000 0010 | 1000 0001 | 1111 1110 | -126 |
1000 0001 | 1000 0000 | 1111 1111 | -127 |
1000 0000 |
– | – | -128 |
0111 1111 | 0111 1111 | 0111 1111 | 127 |
0111 1110 | 0111 1110 | 0111 1110 | 126 |
0111 1101 | 0111 1101 | 0111 1101 | 125 |
… | … | … | … |
0000 0010 | 0000 0010 | 0000 0010 | 2 |
0000 0001 | 0000 0001 | 0000 0001 | 1 |
0000 0000 | 0000 0000 | 0000 0000 | 0 |
Also, use the short int
as an example. The singed integer is stored with the complement code format in the memory. The complement codes are from 0000 0000
to 1111 1111
, in the period from 0000 0000
to 0111 1111
the values are positive (0 to 127), in the period from 1000 0001
to 1111 1111
the values are negative (-127 to -1).
You might find that there is no code 1000 0000
because the highest bit of the complement code is 1, so it’s a negative value, and we should minus one from it to transform it to the reverse code. But all bits of it are zero, so the code has to borrow one to the highest bit. The highest bit is a signed bit that can not be changed. Now we know the complement code 1000 0000
can not be converted to an integer, how do we deal with this value? Discarding the value is too wasteful. People specific the value as the number -128
.
Value overflow
The integer types char
, short
, int
, and long
have limited length. The over bits would be discarded when you assign a very large value. When occurring overflow, as some highest bits are ignored, the result will be very strange.
Let’s see an example:
1 |
|
The variable a
is an unsigned int
, so its length is 4 bytes, the max value of it is 0xFFFFFFFF
. The assigned value 0x100000000
is equal to 0xFFFFFFFF + 1
and over the value range of the unsigned int
, so the highest bit will be cut, all the remaining bits are 0. So that the value of the variable a
is 0 in the memory.
The variable b
is a signed int
, and its value is saved in the memory with complement format. The count of its value bit is 31, but the assigned value is 0xFFFFFFFF
that has 32 bits so that the highest bit will be overwritten to 1, then we get the complement value of the b
is 0xFFFFFFFF
. Converting to the original code, we can get the value is -1
.