Data Types
Author(s): PlatinumMaster
Introduction
In computing, a data type is an attributed piece of data. The attribute tells the computer how to interpret the data associated with it, allowing it to perform a certain set of operations (and explicitly those operations).
This reference guide aims to provide you information on the various data types used on the Nintendo DS, more specifically the ARMv5T architecture.
Number Representations
Before we can go into data types, you must understand number representations, as they make the entire process easier.
Number Bases
A number base (written as base-) is the number of unique digits that can be used to represent a numerical value, starting from zero. For example, the numbers we often use in everyday life (0, 1, 2, ..., 10, ...) are written in base-10
notation, meaning there are 10 unique symbols used to represent a digit of a number (in this case, 0 - 9).
When performing mathematical operations such as addition and subtraction, you may have a scenario where you add two digits and get a value higher than that a digit can support. A good example would be adding 9 + 9 together; since we only have 10 digits to assign to a given digit, we cannot use a single digit to represent the result of 9 + 9.
To solve this problem, we can utilize multiple digits to represent this value, with an operation known as a carry
. A carry
is a representation of the fact that when the sum in a column exceeds the highest digit in the numeral system, one has to carry over the excess to the next column. An example can be seen below with 9 + 9.
Changing Bases
Given two bases, to convert a number from base- to base-, follow this simple algorithm:
- Convert from base- to base 10 by multiplying each digit with the base raised to the power of the digit number (starting from rightmost digit), then summing it together.
- Divide the base-10 representation of by until the quotient is 0, and calculate the remainder each time. If the number you get at a step is a decimal, you must floor/truncate it at each step (meaning floor(15.2) = 15, and floor(15.9) == 15), then use that floored value to calculate the next one.
- The destination base digits are the calculated remainders, where the first remainder is the least significant bit, and the last remainder is the most significant bit.
Bit
A bit is a binary value which is the underpinning of all computer data types. It is analogous to a light switch, where turning on and off the light are two distinct options; as such, a single bit can represent two distinct values, 0
(representing off), or 1
(representing on).
Binary
Bits can be combined into a binary string
of any length, which allows you to represent any number as a series of 0
s and 1
s. This representation is also known as base-2
, as there are only two symbols that are used to represent numbers.
Converting To Base-2 From Base-10
For an example, let us take the base-10 number 62, and convert it to binary.
If we let bin(x)
represent the operation of taking and converting it to binary, the operation is as follows:
- . Since 62 is evenly divisible by 2, there is no remainder (so ).
- . Since 31 is not evenly divisible by 2, there is a remainder (so ).
- . Since 15 is not evenly divisible by 2, there is a remainder (so ).
- . Since 7 is not evenly divisible by 2, there is a remainder (so ).
- . Since 3 is not evenly divisible by 2, there is a remainder (so ).
- . Since 1 is not evenly divisible by 2, there is a remainder (so ).
Since floor(0.5) = 0, we are done. The final binary representation is the remainders being placed in order from right to left (meaning 0 is the right most, in this scenario). As such, = 111110
.
We can also verify that the number of bits we got is correct by taking the logarithm of 62 (with respect to base-2). , meaning we would need to perform 6 division operations to get our representation (which is what we just did above). This means that to store the value 62 in base-2, you would need 6 bits.
Converting From Base-2 to Base-10
Going from base-2 to base-10 is also a fairly standard procedure. Let us use the binary string 01101101
for an example.
Each of these bits, as stated before, can be converted to base-10 value by multiplying the digit by it's base raised to the power of the index of the digit. We can then take the sum of all of these base-10 values, and get the final value. Note that the indexing of the bits starts at 0, and the 0th bit is the right-most bit.
- At bit index 0, we have a value of
1
. . - At bit index 1, we have a value of
0
. . - At bit index 2, we have a value of
1
. . - At bit index 3, we have a value of
1
. . - At bit index 4, we have a value of
0
. . - At bit index 5, we have a value of
1
. . - At bit index 6, we have a value of
1
. . - At bit index 7, we have a value of
0
. .
Summing the values, we get:
Therefore, .
The keen-eyed amongst you may have noticed that when you multiply the digit by (where is the index of the bit), you will be multiplying by either 0 or 1. Given a number :
- .
- .
As such, you can simplify the binary rules: if a bit at index is 1
, the value of that bit is just . If it is 0, then the value of the bit is also 0.
This allows you to derive the following decimal values, assuming the bit at index is 1:
- The digit in the 0th index, which is the right most digit, has a decimal value of .
- The digit in the 1st index, which is the second right most digit, has a decimal value of .
- The digit in the 2nd index, which is the third right most digit, has a decimal value of .
- The digit in the 3nd index, which is the fourth right most digit, has a decimal value of .
- The digit in the th index, which is the right most digit, has a decimal value of .
Signed and Unsigned Primitives
So far, what we have described seems as if it would only produce positive values.
- When we are talking about bit indices, we can only index what is there, which is positively ranged from 0 to the bit count - 1, inclusive.
- Furthermore, there is no way to have (or, any function) spit out a negative number unless you utilize the complex number system (which is not what we are working with).
If that is the case, how is it possible to perform operations involving negative numbers?
Up until now, we have been working with unsigned
numbers. Unsigned numbers are numbers that do not have any data that can represent the sign of the number (so, they are treated as positive numbers only). For us to have an signed
number, which does include this information, we need to find a way to store this data.
Signed numbers on ARMv5T utilize the most significant bit in a byte to designate that the number has a sign.
Hexadecimal (Base-16)
Hexadecimal is a number system which has 16 possible digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A (10), B (11), C (12), D (13), E (14), F (15).
Numbers in this format can often be prefixed with 0x
to denote that they are in hexadecimal. Although this is not necessary, it helps to distinguish the difference between decimal numbers, and hexadecimal numbers using purely symbols which can be misconstrued as decimal symbols (for example, 0x33
vs 33
are not the same, but dropping 0x
would make them appear to be).
Converting From Decimal to Hexadecimal
For an example, let us take the decimal number 62, and convert it to hex.
If we let hex(x)
represent the operation of taking and converting it to hexadecimal, the operation is as follows:
- . Since 62 is not evenly divisble by 16, there is a remainder (so ). The decimal number 14 is equivalent to the hexadecimal number E, so
E
is our value. - . Since 3 is not evenly divisble by 16, there is a remainder (so ). The decimal number 3 is equivalent to the hexadecimal number 3, so
3
is our value.
floor(0.1875) = 0
, so we are done. This means that = 0x3E.
Converting From Binary to Hexadecimal
Say we are tasked with turning the string 10110100
into a hexadecimal number. Intuitively, you would assume that you could just convert the binary string to decimal, then perform the steps mentioned above. This can be done, however there is a far simpler (and quicker) method.
Using our logic from before, the hexadecimal digit at index has a base decimal value of . As we know from before, . This means that we can represent the base decimal value at index as -- that is, each hexadecimal digit will correspond to a group of 4 bits.
What this means is that we can effectively take our string 10110100
, split it up into groups of four bits, convert each group to hexadecimal, then place the digits together to get the hexadecimal value. So, let's try it.
Our two groups are 1011
and 0100
, which we will treat as two individual binary strings.
- First we find
hex(1011)
. The decimal value of 1011 = , which corresponds toB
in hexadecimal. - Then we find
hex(0100)
. The decimal value of 0100 = , which corresponds to4
in hexadecimal. So, the final hexadecimal value isB4
, or0xB4
with the prefix.
Data Types
Byte
A byte
is a group of exactly 8 bits (and strictly no larger) in a sequential order.
You can think of a byte as a single binary string with exactly 8 bits. For example:
00000000
is 1 byte.10000000
is 1 byte.11111111
is 1 byte.
Value Range
As bytes are essentially bitgroups, we can convert them to other representations very easily. Let us take the byte 11111111
for example.
- For decimal:
dec(11111111)
= . hex(11111111)
=0x{hex(1111)}{hex(1111)}
=0xFF
.
Ranges
The values a byte can be are between 00000000
and 11111111
, inclusively.
- In decimal (base-10), this is [0, 255].
- In hexadecimal (base-16), this is [0, 0xFF].
Bytes are typically displayed in hexadecimal, especially in hex editors.