Binary: a tale of ones and zeros

Overview

In this module we will learn about binary numbers, how variables are stored in computer memory, and what this has to do with variable data types.

It’s ones and zeros all the way down

This module is broken down into the following sections: Binary Numbers, Computer Memory, and Variable Class and Memory.

Learning Objectives

  • Know how to count in binary and how to convert binary numbers to decimal (and vice versa)
  • Be able to describe how computers store information (hint: it’s in binary)
  • Know and be able to discuss the advantages and disadvantages of using the following MATLAB numeric classes: double, single, uint8, uint16.
  • Define the term integer saturation and explain why care must be taken when performing integer math
  • Define ASCII code
  • Explain the concept of typecasting and be able to typecast from one variable class to another
  • Know how to use the listed important functions
  • Define the listed important terminology

Important MATLAB Functions

  • dec2bin – Convert decimal to binary number in character array
  • bin2dec – Convert binary number to decimal number
  • double – convert to double precision
  • single – convert to single precision
  • uint8 – convert the array into unsigned 8-bit (1-byte) integers
  • uint16 – convert the array into unsigned 16-bit (2-byte) integers
  • logical – convert the array to a logical class
  • char – convert the array to a character class
  • num2str – convert numbers to a character array

Important Terminology

  • base 10 numeral system – aka decimal
  • base 2 numeral system – aka binary
  • bit – an elemental unit of information in computing
  • byte – 8 bits. The smallest addressable memory element in most computers.
  • ASCII – a character encoding standard
  • Variable Class – a class identifies the properties of the variable such as the number of bytes required to store that variable and the possible range of values.
  • Dynamic Range – the ratio between the largest and smallest values possible
  • Bit depth – the number of bits reserved for each element of a variable
  • Type casting – the process of converting the contents of a variable from one class to another class

Binary Numbers


Binary numbers use the base 2 numeral system. In base 2, every numeric value is represented using only two different symbols (typically the digits 0 and 1).

Binary numbers are often clustered in groups of 8 digits (see next section). From right to left, each position in this cluster has an equivalent bit-number, starting from 0, that indicates increasing powers of 2:

So, the bit on the farthest right corresponds to 2^0, while the bit on the farthest left (in a group of 8 bits) corresponds to 2^7. Thus, to create any numeric value, you write it as a sum of these powers of 2. For example:

3 = 2^1 +  2^0

In this equation we have a two to the power of 1 and a two to the power of 0, which correspond to the first two bit positions. In binary, this would be written as 11, or 0000 0011. The following table shows the binary representation of a series of decimal numbers:
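The table of decimal numbers and their binary representations can be regenerated in MATLAB. The following is a minimal sketch; the variable names (decimalValues, binaryStrings, bits, powersOfTwo) are chosen here for illustration, and the range 0 to 8 is an arbitrary choice.

```matlab
% Decimal values 0 through 8 and their 8-bit binary representations.
decimalValues = (0:8)';                     % column vector of decimal numbers
binaryStrings = dec2bin(decimalValues, 8);  % 9x8 char array, one row per number
disp(binaryStrings)

% Rebuild the value 3 by hand from its bits: 0000 0011 = 2^1 + 2^0.
bits = [0 0 0 0 0 0 1 1];                   % leftmost bit is bit-number 7
powersOfTwo = 2.^(7:-1:0);                  % 128 64 32 16 8 4 2 1
valueFromBits = sum(bits .* powersOfTwo);   % 3

% bin2dec performs the same conversion starting from a character array.
valueFromText = bin2dec('00000011');        % 3
```

Each row of binaryStrings is the binary form of the matching entry in decimalValues, so row 4 holds the value 3.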

Challenge 0

What does the MATLAB function dec2bin do? Try the following function call:

dec2bin(3,8)

  • what is returned?
  • what is the first input?
  • what is the second input?

Answer: see Challenge 0 Answer under Challenge Answers.

Computer Memory


At the heart of the matter, computers store information using a sequence of ones and zeros.

1 and 0s - the language of computers
  • A bit is an elemental unit of information in computing (typically treated as a 1/0 or true/false). It is a single 0 or 1 that can represent basic information such as on/off, plus/minus, or one digit in the base-2 numeral system.
  • A byte contains 8 bits and is the smallest addressable memory element in most computers. This means that a computer cannot store anything smaller than a byte (even if all you need to store is just 1 bit of information).

This is the reason why you will often see binary numbers preceded by a series of zeros. For example, when indicating the value one in a byte, you precede the one with 7 zeros as such:

0000 0001

It takes one byte of computer memory just to save the number one.

Bit Depth

Although one byte is the smallest addressable memory element, you can allocate more than one byte to a memory element.

Bit depth is a term to indicate how many bits of memory are allocated to a memory element. For example:

  • 8-bit: contains 1 byte per element
  • 16-bit: contains 2 bytes per element

Confusingly, some acquisition devices, such as some cameras on microscopes, can acquire information that is not easily divisible into bytes. For example, there are cameras that acquire 12-bit images. In this case, those images are stored in computer memory as 16-bit, even though there are only 12 bits of information.

As we will see later in the course, this disconnect between the way the data is acquired and the way it is stored can cause display issues, which are easily corrected if you understand bit depth.
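To make the 12-bit versus 16-bit mismatch concrete, here is a small sketch of the arithmetic. The variable names and the example pixel value are hypothetical and are not taken from any particular camera.

```matlab
% Number of distinct values and the maximum value for several bit depths.
bitDepths = [8 12 16];
numLevels = 2.^bitDepths;      % 256, 4096, 65536
maxValues = numLevels - 1;     % 255, 4095, 65535

% A 12-bit pixel stored in a 16-bit container (hypothetical value).
pixel = uint16(4095);          % brightest value a 12-bit camera can report
fractionOfRange = double(pixel) / double(intmax('uint16'));  % about 0.0625
```

The brightest possible 12-bit value fills only about 6% of the 16-bit range, which is one reason such images can look dark until the display range is adjusted.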

Variable Class and Memory


As we have previously discussed, variables represent storage locations in the computer’s memory. When dealing with very large numbers or very large arrays, it is critical to understand how MATLAB allocates memory when assigning values to a variable.

We will discuss the following classes: the Numeric Class, the Character Class, and the Logical Class.

Numeric Class


Numeric class variables store numbers. There are many different numeric classes, which can be broadly broken down into floating-point vs integer classes. Floating-point classes can store fractional values (i.e. digits after the decimal point), whereas the integer classes solely handle whole numbers.

MATLAB numeric classes include signed and unsigned integers, and single- and double-precision floating-point numbers. Each class has different memory requirements. You can find a list of the numeric types available in MATLAB in the MATLAB documentation.

In this module, we will discuss the following MATLAB numeric classes: double precision, single precision, and integers.

Double Precision


The default MATLAB numeric class is double. Double-precision variables use 64 bits (8 bytes) of memory per element in an array. With this much memory per element, they can represent both very large and very small numbers. This is also known as having a very large dynamic range, or a large ratio between the largest and smallest values possible.

  • Doubles store values to approximately 15–17 significant decimal digits

Because of the amount of memory allocated per element in an array, double-precision variables can consume a lot of memory. The following is an illustration of how a double precision number is stored in memory:

As you can see in the illustration, there are 64 different positions. In each position, you can store a one or a zero. The first position is the sign bit. The remaining 63 positions store the value of the number: 11 bits for the exponent and 52 bits for the fraction.

In MATLAB, you can see how much memory a double precision variable consumes with the following example:

>> a = 1
>> b = 1e24
>> c = 1:10
>> whos a b c

Notice that the variables a and b, which respectively have the values of one and one septillion, use only 8 bytes of memory, while the variable c, which is an array of 10 numbers, takes 80 bytes to store in memory.
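The byte counts can also be read programmatically, and MATLAB can report the limits of the double class directly. This is a sketch only; the names info, bytesPerVariable, largest, smallest and spacing are illustrative.

```matlab
a = 1;
b = 1e24;
c = 1:10;

% whos can return its report as a struct array instead of printing it.
info = whos('a', 'b', 'c');
bytesPerVariable = [info.bytes];   % 8 8 80

% Limits of the double class.
largest  = realmax;   % about 1.8e308
smallest = realmin;   % about 2.2e-308 (smallest normalized positive value)
spacing  = eps(1);    % gap between 1 and the next larger double, 2^-52
```

The ratio between largest and smallest gives a feel for the dynamic range of a double, while spacing shows where the limit of roughly 15–17 significant digits comes from.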

Single Precision


The single data type requires 32 bits (4 bytes) per element to store in memory, half as much as a double, and is ideal for storing and processing real-number values when you don’t require the level of accuracy provided by double precision.

The default numeric class for MATLAB is double. To convert from the double class to another class you must use a type-casting function, such as single.

For example:

>> d = single(c)

d =

  1×10 single row vector

     1     2     3     4     5     6     7     8     9    10

>> whos('d')
  Name      Size            Bytes  Class     Attributes
  d         1x10               40  single     

The variable d contains the output from the conversion of the variable c into the single class. Note that d requires only half of the bytes (40) that c requires (80), even though it contains the exact same values (1:10).

Recall that the function whos displays the properties of the named variable.
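The memory saving comes at the cost of precision. The sketch below shows a whole number that a double stores exactly but a single cannot; the value 16777217 (2^24 + 1) is chosen for illustration, and the variable names are hypothetical.

```matlab
% single keeps about 7 significant decimal digits, double about 15 to 17.
x = 16777217;                       % 2^24 + 1, stored exactly as a double
y = single(x);                      % rounds to 16777216, the nearest single
lostPrecision = (double(y) ~= x);   % true

% Spacing between 1 and the next larger value in each class.
epsDouble = eps('double');          % 2^-52
epsSingle = eps('single');          % 2^-23
```

For the small whole numbers in c and d the two classes agree exactly; the difference only shows up once a value needs more digits than single can hold.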


Integers


Integer class variables can store only whole numbers in each element. Integer classes typically require far less memory per element than floating-point classes. However, they can only handle a small range of values, such as 0 – 255 for uint8.

Digital Images are often stored in an unsigned integer class. The most common ones that we will be using for digital images are uint8 and uint16.

There are other integer classes (int8, int16), which are signed (i.e. can have negative values), but we will likely not use these classes in our course.

Memory Considerations


The main reason to use an integer class is to save memory. Let’s store the same values (1:10) as 8-bit unsigned integers:

e = uint8(1:10)

e =

   1×10 uint8 row vector

    1    2    3    4    5    6    7    8    9   10

whos('e')

  Name      Size            Bytes  Class    Attributes
  e         1x10               10  uint8              

Compared to the double c, or the single d, the uint8 e requires only 10 bytes of memory. This is because uint8 arrays require only 1 byte of memory per element.
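The saving matters most for large arrays such as images. As a rough sketch, assume a hypothetical 1000 x 1000 image (the size and variable names are illustrative):

```matlab
% A hypothetical 1000 x 1000 image stored in three different classes.
imgDouble = zeros(1000, 1000);             % default class is double
imgSingle = zeros(1000, 1000, 'single');
imgUint8  = zeros(1000, 1000, 'uint8');

info = whos('imgDouble', 'imgSingle', 'imgUint8');
bytes = [info.bytes];                      % 8000000 4000000 1000000
```

The same one million elements take 8 MB as double, 4 MB as single and 1 MB as uint8.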

Saturation


Integers have much lower possible maximum values than their floating-point counterparts (lower dynamic ranges). Watch what happens when you type cast a very large number (1e24) to an 8-bit unsigned integer:

f = uint8(1e24)

f =

  255

whos('f')

  Name      Size            Bytes  Class    Attributes
  f         1x1                 1  uint8    

Notice that when we converted 1 septillion to an 8-bit unsigned integer, the value was clipped to 255 and the variable (f) was allocated 1 byte of memory.

Remember, for an 8-bit unsigned integer, the maximum value you can have is 255. In computer memory, an 8-bit integer has only 8 bit positions, and the largest value is reached when all of them are set to 1. So, in binary, 255 would be represented as follows:

1111 1111
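You do not have to memorize these limits, since MATLAB can report them. The following sketch uses intmin and intmax; the variable names are illustrative only.

```matlab
% Class limits reported by MATLAB.
lo8  = intmin('uint8');    % 0
hi8  = intmax('uint8');    % 255
hi16 = intmax('uint16');   % 65535

% The largest n-bit unsigned value has every bit set: 2^n - 1.
allOnes8 = bin2dec('11111111');   % 255
check8   = 2^8 - 1;               % 255

% Values outside the range are clipped in both directions.
tooBig   = uint8(300);    % 255
tooSmall = uint8(-5);     % 0
rounded  = uint8(2.5);    % 3, fractional values round to the nearest integer
```

Note that type casting to an integer class both clips out-of-range values and rounds away any fractional part.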

Challenge 1

How much memory is allocated for a 16-bit unsigned number? See Challenge 1 Answer under Challenge Answers.

Math With Integer Classes

Although useful for conserving memory, care must be taken when performing math with integer classes.

For example, consider the following:

a = uint8([2 4 16 32 64 128 255])
b = a + 10

b =

   12   14   26   42   74  138  255

Notice that every value except the last increases by 10. The last value, 255, is already at the maximum of the uint8 class, so the result of 255 + 10 is clipped back to 255. This is called integer saturation, or ‘clamping’ the value to the class maximum (255).

For a more detailed discussion of Integer and Single-Precision Math, please refer to this article.
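One common way to avoid these surprises is to type cast to double before doing the math. The sketch below contrasts the two approaches; the values and variable names are illustrative.

```matlab
a = uint8([200 100]);

% Adding in uint8 saturates at 255.
sumInt = a(1) + a(2);                      % 255

% Converting to double first keeps the true result.
sumDouble = double(a(1)) + double(a(2));   % 300

% Integer division rounds to the nearest integer.
quotInt    = uint8(5) / uint8(2);          % 3
quotDouble = 5 / 2;                        % 2.5

% Mixing an integer array with a double scalar still gives an integer class.
resultClass = class(a + 10);               % 'uint8'
```

The trade-off is memory: the double copies take 8 bytes per element, so it is usually best to convert only for the calculation and cast back afterwards if the result fits.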

Challenge 2

What do you think will happen if you add 200 more to b, i.e. b + 200?

What do you think will happen if you subtract 10 from a? i.e. a-10?

Answer: see Challenge 2 Answer under Challenge Answers.


Character Class


The character class handles characters (letters, numbers, spaces, etc.). So, how are characters stored in computer memory using binary?

The answer is ASCII. ASCII stands for the “American Standard Code for Information Interchange”. ASCII is a “character encoding scheme” (basically a look-up table) where each character of text has a numeric equivalent. Any text that you see on a computer screen (or on your phone) has a numeric equivalent, even the commas and periods, and even the characters for numeric digits, like "1". ASCII itself defines only 128 characters; characters outside that set, such as emojis, get their numeric codes from Unicode, which extends the same idea.

For example, the character '1' is stored in memory as the number 49 (0011 0001 in binary).

What about the character array 'hello'?

ch = 'hello'

Same deal, but as a vector of codes: 104 101 108 108 111.

In each case, it is the numeric ASCII code that is stored in memory, instead of the actual character.
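You can inspect the stored codes yourself by type casting a character array to a numeric class. A minimal sketch (variable names are illustrative):

```matlab
% The code behind the character '1', in decimal and in binary.
codeOfOne = double('1');             % 49
bitsOfOne = dec2bin(codeOfOne, 8);   % '00110001'

% The codes behind each character of 'hello'.
ch = 'hello';
codes = double(ch);                  % 104 101 108 108 111
```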

Type casting – character class


This encoding scheme for character arrays can have important implications. Consider the following:

n = '1'

n+1

ans =

    50

So,

‘1’ + 1 = 50???

What’s going on here?

To understand this result, simply review the ASCII code. Remember, the character '1' is actually stored in memory as 49. When you perform a mathematical operation on a character array, MATLAB automatically type casts the character array to its numeric codes so that it can do the math, as follows:

49+1 = 50

Similarly, if you add 2 to the variable ch (which contains the character array 'hello'), then you get the following result:

ch + 2
ans =

   106   103   110   110   113

MATLAB type casts the entire character array to its ASCII numeric equivalent (see above) and then adds 2.

Even adding two characters returns a similar result:

'A' + 'B'

ans =

   131

Can you guess the ASCII codes for 'A' and 'B' from this result? If you divide 131 by 2 you get 65.5. ASCII codes are whole numbers and consecutive letters have consecutive codes, so 'A' must be 65 and 'B' 66.
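The guess is easy to check. In this short sketch (names are illustrative), note that the result of the addition is a double, not a character:

```matlab
codeA = double('A');          % 65
codeB = double('B');          % 66
total = 'A' + 'B';            % 131
totalClass = class(total);    % 'double'
```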

Getting the ASCII code


We can explicitly type cast characters to their numeric ASCII equivalents by using one of the numeric class functions (e.g. double, single, uint8, etc.).

For example, the function uint16 will get the ASCII code equivalent for any character in a character array:

uint16('aw hell no')

ans =

  1×10 uint16 row vector

    97   119    32   104   101   108   108    32   110   111

The result is a series of numbers that give the ASCII numeric code for each character in the array. Can you spot the ASCII code for 'space'? Count over by characters: the third and eighth characters are spaces (ASCII code 32).


Challenge 3

Why is uint16 the best choice for type casting character arrays (and not, say, uint8?)

Answer: see Challenge 3 Answer under Challenge Answers.


You can also type cast an integer array into a character array using the function char:

>> char([111    104     32    121    101     97    104])

ans =

    'oh yeah'

So, if you want to know what the character equivalent of a given ASCII code is, simply type cast a number to the character class using the char function. For example:

>> char(64)

ans =

    '@'

Or an array of numbers…

>> char(1:50)

ans =

    '    
     
      !"#$%&'()*+,-./012'

That large space before the exclamation point is not empty. It contains non-printing characters, such as line feed or escape, that do not show up in screen displays.

Upper Case vs Lower Case


As you may have guessed, lower-case and upper-case letters have different ASCII codes. This is why MATLAB variable names, and file names on some systems such as UNIX, are case-sensitive.

Let’s use the function upper to convert b to all caps and then typecast the variables into the integer class uint16.

>>b = 'a':'f'

    b =
        'abcdef'

>>c = upper(b)

    c =

        'ABCDEF'

>> uint16(b)

ans =

  1×6 uint16 row vector

    97    98    99   100   101   102

>> uint16(c)

ans =

  1×6 uint16 row vector

   65   66   67   68   69   70


Notice that we get a different series of numeric ASCII code for the lower vs upper case characters.
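The two series differ by a constant offset. The following sketch makes that explicit and shows how case affects text comparison; the variable names are illustrative.

```matlab
lowerLetters = 'a':'f';
upperLetters = upper(lowerLetters);

% Each lower-case code is exactly 32 more than its upper-case partner.
offset = double(lowerLetters) - double(upperLetters);   % 32 32 32 32 32 32

% Subtracting 32 and type casting back to char gives upper case.
manualUpper = char(lowerLetters - 32);                   % 'ABCDEF'

% Case-sensitive vs case-insensitive comparison.
sameExact  = strcmp('abc', 'ABC');     % false
sameIgnore = strcmpi('abc', 'ABC');    % true
```

In practice you would use upper and lower rather than the manual offset, which only holds for the letters a to z.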

Remember, behind the scenes, everything is numbers.

The Matrix is everywhere

Properly Converting Numbers to Characters (And Vice Versa)

There are a whole series of functions with 2 in the middle of the function name that simplify converting values from one class to another.

To properly convert a number to its character array equivalent, you can use the function num2str, as follows:

>> chary = num2str(1)

chary =

    '1'

Conversely, to convert a character array of numbers back to its numeric equivalent, you can use the str2num function:

>> numbery = str2num('1')

numbery =

     1

There are other functions that perform similar actions, such as str2double, but you can review that on your own.
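The difference between type casting and proper conversion is easiest to see side by side. This is a small sketch with illustrative variable names and an arbitrary label text:

```matlab
chary = num2str(1);                  % '1'

% Math on the character uses its ASCII code (49), not its meaning.
wrong = chary + 1;                   % 50

% Converting the text to a number first gives the intended result.
right = str2double(chary) + 1;       % 2

% str2double returns NaN for text it cannot interpret as a number.
notANumber = str2double('hello');    % NaN

% Build a text label from a number by converting it before concatenating.
label = ['Frame ' num2str(12)];      % 'Frame 12'
```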


Logical Class


Logical arrays have the fewest possible values for each element. They can contain either a 0 or a 1 (interpreted as FALSE or TRUE, respectively).

Thus, they require only 1 byte of memory per element, but really only 1 bit (out of those 8) is being used to represent the value.

Something like this: TRUE is stored as 0000 0001 and FALSE as 0000 0000.

Type casting to logical class

MATLAB type casts variables to the logical class by converting all non-zero elements to TRUE and all zero elements to FALSE.

You can use the function logical to typecast to the logical class. For example, the following syntax converts a numeric array to a logical array:

k = -2:1:2
l = logical(k)

The resultant logical array, l, has the same dimensions as k, but has only 1’s and 0’s (or TRUE and FALSE, respectively).

k =
    -2    -1     0     1     2

l =
     1     1     0     1     1

Note that only the zero from k was converted to a FALSE. The rest of the numbers were converted to TRUE.
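The sketch below repeats the conversion and checks its effect on memory. It also shows that the conversion cannot be undone, since the original values are lost; the names info, bytes, numTrue and backToNumbers are illustrative.

```matlab
k = -2:1:2;
l = logical(k);

% Memory used by the double array and by the logical array.
info = whos('k', 'l');
bytes = [info.bytes];          % 40 5

% Count the TRUE elements.
numTrue = nnz(l);              % 4

% Casting back to double returns only ones and zeros.
backToNumbers = double(l);     % 1 1 0 1 1
```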

You can do something similar with a character array:

logical('hello')

ans =

  1×5 logical array

   1   1   1   1   1


However, you will typically just get all TRUE since all printable characters are represented by an ASCII code that is greater than zero.


SIDEBAR

By the way, the ASCII code 0 codes for the NULL character, a character that basically means “don’t do anything, don’t print, don’t display, nothing”. This is not a character you can type using your keyboard, and you are unlikely ever to use the NULL character on purpose. However, we can force its use for this example, using the following syntax:

logical(['hello' char(0) 'goodbye'])

ans =

  1×13 logical array

   1   1   1   1   1   0   1   1   1   1   1   1   1



Challenge Answers

Challenge 0 Answer

What does the MATLAB function dec2bin do? Try the following function call:

dec2bin(3,8)

ans =

    '00000011'


  • what is returned? The value 3 in binary, as a character array
  • what is the first input? The decimal number to convert
  • what is the second input? The minimum number of binary digits (bits) to return


Challenge 1 Answer

2 bytes (16 bits)


Challenge 2 Answer

The last three elements will be saturated (clamped to 255)

b = b+200

b =

  212  214  226  242  255  255  255

The first two elements of the array will be clamped to the class minimum (0)

a = a-10

a =

    0    0    6   22   54  118  245


Challenge 3 Answer

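MATLAB stores each element of a char array in two bytes (16 bits), because there are more than 256 different characters that can be used on a computer. A uint16 also uses two bytes per element, so it can hold any code that a char element can hold (0 to 65,535), whereas uint8 would saturate any code above 255. Larger classes such as single or double would also work, but they use more memory than is needed.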


FIN