Strings

Overview

String arrays were designed to manage collections of character arrays. They are a relatively recent addition to MATLAB (the string class arrived in R2016b, and the double-quote syntax in R2017a), but they have quickly become a powerful tool for managing collections of characters and words.

This module is broken down into the following sections:

  • Background
  • String Array Rising
  • String Array Functions


Learning Objectives

  • Differentiate between a Character and a String Array.
  • Assign values to string arrays using paired double quotes " "
  • Index string arrays using both ( ) and { }, and understand the outcome of each
  • Manipulate string arrays using mathematical operators such as + and logical operators such as ==
  • Use the functions found on the Character and String Reference Page

Special MATLAB Characters

  • " " – paired double quotes: for creating string arrays
  • [ ] – square brackets: for concatenation
  • ( ) – parentheses: for indexing
  • { } – curly brackets: for extracting contents from string array elements

But First a Bit of Background


So, why do we need another variable type just to manage a bunch of letters?

The Trouble With Character Arrays

Recall our issue with multi-row character arrays:

chary = char('one fish', 'two fish', 'red fish', 'blue fish')

chary =

  4×9 char array

    'one fish '
    'two fish '
    'red fish '
    'blue fish'

We needed the function char to pad each row with spaces so that every row has the same number of characters (recall that a space is a character). But this becomes burdensome rather quickly. Every new row must be padded to the current width, and a row longer than the rest forces the entire array to be re-padded. Worse, those extraneous spaces in a large character array take up valuable memory for no reason. There had to be a better way.
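As a minimal sketch of the padding problem (the variable names rows, padded, and firstRow are made up for illustration), the following shows that unequal rows cannot be stacked directly, and that the spaces char adds are real characters stored in the array:

```matlab
% Rows of unequal length cannot be stacked directly with [ ; ].
try
    rows = ['one fish'; 'blue fish'];   % 8 vs 9 characters
catch err
    disp(err.message)                   % a dimension-mismatch error
end

% char pads every row with trailing spaces to the longest length.
padded = char('one fish', 'blue fish');
size(padded)                            % 2 rows, 9 columns

% The padding is real data: the first row now ends in a space.
padded(1, end) == ' '

% strtrim removes the padding when a single row is pulled back out.
firstRow = strtrim(padded(1, :));
length(firstRow)                        % 8 characters again
```

Every time a row is read back out of the padded array, the trailing spaces have to be trimmed away like this.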

The Cell Array Kludge

For the longest time, the cell array was the better way to package character arrays. In fact, this remains one of the best uses for cell arrays. In a cell array, each row of a character array can simply be added to one of the cell elements, no space-padding required. For example:

celly = {'one fish'; 'two fish'; 'red fish'; 'blue fish'}

celly =

  4×1 cell array

    {'one fish' }
    {'two fish' }
    {'red fish' }
    {'blue fish'}

And this worked for many years, but it was a kludge. Cell arrays were designed to package disparate variable types into a single variable. It was easy to package things into cell arrays, but harder to get them out. The indexing was unnecessarily complex. Worse, due to its promiscuous proclivities (you could store anything in a cell array), the cell class had large memory overhead and few functions designed specifically for managing character arrays.
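To make the indexing complaint concrete, here is a short sketch (sub, txt, and isBlue are illustrative names only). It shows the two kinds of cell indexing and the function call needed to compare text stored in a cell array:

```matlab
celly = {'one fish'; 'two fish'; 'red fish'; 'blue fish'};

sub = celly(1);        % parentheses: still a 1x1 cell array
txt = celly{1};        % curly brackets: the character array inside
class(sub)             % 'cell'
class(txt)             % 'char'

% Comparing text in a cell array goes through a function such as strcmp;
% the == operator is not defined for cell arrays.
isBlue = strcmp(celly, 'blue fish');
celly(isBlue)          % the matching cell(s)
```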


String Array Rising


Enter string arrays – the leaner and meaner variable class designed from the ground up to wrangle character arrays.

The syntax is fairly simple. Instead of single quotes, you use double quotes. And you concatenate using square brackets, like numeric arrays:

stringy = ["one fish", "two fish", "red fish", "blue fish"]'

stringy = 

  4×1 string array

    "one fish"
    "two fish"
    "red fish"
    "blue fish"


  • Notice here that we used the transpose operator ' to orient our new string array, stringy, as a column vector. Also notice that stringy requires fewer bytes to store in memory than celly: 336 vs 518 bytes, a reduction of about 35%.
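One way to check the memory comparison yourself is sketched below. It assumes both variables exist in the current workspace; the exact byte counts depend on the MATLAB release, so your numbers may differ from those quoted above.

```matlab
celly   = {'one fish'; 'two fish'; 'red fish'; 'blue fish'};
stringy = ["one fish"; "two fish"; "red fish"; "blue fish"];

% whos returns a structure whose bytes field holds the storage size.
c = whos('celly');
s = whos('stringy');

fprintf('celly:   %d bytes\n', c.bytes);
fprintf('stringy: %d bytes\n', s.bytes);
fprintf('change:  %.0f%%\n', 100 * (s.bytes - c.bytes) / c.bytes);
```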

Indexing String arrays

String arrays follow the same indexing rules as other container types, such as cell arrays. Indexing with parentheses () returns a smaller string array, whereas indexing with the curly brackets {} returns the contents of the elements (or in this case, the character array contained within).

For example, to return the first element from stringy as a scalar string array we use the parentheses as follows:

 stringy(1)

ans = 

    "one fish"

  • We can tell that ans is a string array because the characters in the result are bracketed by double quotes: "one fish" (or, by examining the properties of ans in the workspace).

To unpack the first element from stringy and access the character array, we use the curly brackets {} as follows:

stringy{1}

ans =

    'one fish'

  • Here, we can tell that ans is a character array because the displayed result is bracketed by single quotes: 'one fish'.
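The difference between the two kinds of indexing can also be checked programmatically. This is a small sketch using the original four-element stringy; the names s and c are illustrative.

```matlab
stringy = ["one fish"; "two fish"; "red fish"; "blue fish"];

s = stringy(1);        % parentheses: a 1x1 string array
c = stringy{1};        % curly brackets: a 1x8 character array
class(s)               % 'string'
class(c)               % 'char'

% numel counts elements, so the two results differ.
numel(s)               % 1 (one string element)
numel(c)               % 8 (eight characters)
strlength(s)           % 8 (characters inside the string)

% Chained indexing: the first three characters of element 4.
stringy{4}(1:3)        % 'blu'

% Parentheses accept ranges and return a smaller string array.
stringy(2:3)
```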

Appending string arrays

String arrays also have syntax reminiscent of handling numeric arrays. For example, we can easily add more rows to our friend stringy, using the following syntax:

stringy = [stringy; "black fish"; "blue fish"; "old fish"; "new fish"]

stringy = 

  8×1 string array

    "one fish"
    "two fish"
    "red fish"
    "blue fish"
    "black fish"
    "blue fish"
    "old fish"
    "new fish"

  • Note here the use of semi-colons to indicate the addition of elements vertically.

Or, if you want to append one string to another, you can easily do that using the + operator.

stringy = stringy + ',' 

stringy = 

  8×1 string array

    "one fish,"
    "two fish,"
    "red fish,"
    "blue fish,"
    "black fish,"
    "blue fish,"
    "old fish,"
    "new fish,"

  • With this syntax, we can quickly append a comma to the end of each element. Notice in this example that we didn’t even need to have the same sized arrays. MATLAB simply assumed we wanted to append the , to the end of each string in stringy. This syntax is reminiscent of when we add a scalar to a matrix: MATLAB assumes that you want to add the scalar to each element of the matrix.
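The + operator also works element by element on arrays of matching size, and it converts numbers to text before appending. A brief sketch, where counts, labeled, and numbered are made-up names:

```matlab
stringy = ["one fish"; "two fish"; "red fish"; "blue fish"];

% A scalar is expanded to every element.
withComma = stringy + ",";

% Two arrays of the same size are joined element by element.
counts  = ["1"; "2"; "3"; "4"];
labeled = counts + ": " + stringy;     % "1: one fish", "2: two fish", ...

% Numbers are converted to text before being appended.
numbered = "fish " + (1:3);            % "fish 1"  "fish 2"  "fish 3"
```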

Or, say we want to replace that last comma with an exclamation point. That’s not too difficult with the following syntax:

stringy{end}(end) = '!'

stringy = 

  8×1 string array

    "one fish,"
    "two fish,"
    "red fish,"
    "blue fish,"
    "black fish,"
    "blue fish,"
    "old fish,"
    "new fish!"


  • Here, we first unpack the contents from the last element in stringy using the curly bracket syntax: {end}. Then we immediately follow that with syntax to access the last character in the character array 'new fish,' by using the parentheses syntax: (end). Then, we assign as normal, replacing the last character (a ,) with an exclamation point (!).
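It is worth seeing what happens if the curly brackets are left out. The sketch below uses a shortened two-element array and the illustrative copies a, b, and c. The third variant is just one possible alternative that avoids character indexing.

```matlab
stringy = ["old fish,"; "new fish,"];

% Curly brackets reach inside the element: only one character changes.
a = stringy;
a{end}(end) = '!';                         % "new fish!"

% Parentheses address the whole element: the entire string is replaced.
b = stringy;
b(end) = "!";                              % "!"

% An alternative that stays with string functions.
c = stringy;
n = strlength(c(end));                     % position of the last character
c(end) = extractBefore(c(end), n) + "!";   % "new fish!"
```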

Logical operations on string arrays

You can also use logical operations on a string array in an intuitive manner. For example, if you want to find the element in stringy that exactly matches “two fish,”, you would use the following syntax:

stringy == "two fish," 

ans =

  8×1 logical array

   0
   1
   0
   0
   0
   0
   0
   0

Or, if we want to find all elements that are not equal to “blue fish,”, we would use this syntax:

stringy ~= 'blue fish,'

ans =

  8×1 logical array

   1
   1
   1
   0
   1
   0
   1
   1

Notice that in this syntax we are matching the entire contents of each element in stringy to the character array. If they don’t match precisely, we won’t get a match. For example, the following syntax finds no matches…

stringy == 'blue fish'

ans =

  8×1 logical array

   0
   0
   0
   0
   0
   0
   0
   0

…because we didn’t include the , in the character array.
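The logical arrays returned by these comparisons are most useful as an index. The following sketch rebuilds stringy as it stands at this point in the module; mask, notBlue, and isTwo are illustrative names.

```matlab
stringy = ["one fish,"; "two fish,"; "red fish,"; "blue fish,"; ...
           "black fish,"; "blue fish,"; "old fish,"; "new fish!"];

mask = stringy == "blue fish,";

nnz(mask)                    % 2 matching elements
find(mask)                   % their positions: 4 and 6
notBlue = stringy(~mask);    % the six elements that do not match

% == is case-sensitive; strcmpi is one case-insensitive option.
isTwo = strcmpi(stringy, "TWO FISH,");
```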


String Array Functions


MATLAB has many functions to manipulate and process string arrays. You can find a list of them on the Characters and String Reference Page. These functions are grouped into the following categories:

  • Create, Concatenate, Convert
  • Determine Type and Properties
  • Find and Replace
  • Join and Split
  • Edit
  • Compare
  • Regular Expressions

If you take a moment to familiarize yourself with these functions, you will find many powerful ways to manipulate string arrays. The following are just a few examples.

Example: Contains

Say we want to find the elements in stringy that contain the character array 'blue'. We can use the function contains as follows:

contains(stringy,'blue')

ans =

  8×1 logical array

   0
   0
   0
   1
   0
   1
   0
   0

  • Notice here that we don’t have to match the entire content of an element in stringy, just a portion of it. By comparison, we needed a precise match when we used the == operator, as illustrated in the previous section.

We can even match just a single character as follows:

contains(stringy,'w')

ans =

  8×1 logical array

   0
   1
   0
   0
   0
   0
   0
   1

  • This syntax returns all elements that contain the letter w in them (two and new, in this case).
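A few related variations are sketched below: case-insensitive matching, matching any of several patterns, and the companion functions startsWith and endsWith. The result variable names are illustrative.

```matlab
stringy = ["one fish,"; "two fish,"; "red fish,"; "blue fish,"; ...
           "black fish,"; "blue fish,"; "old fish,"; "new fish!"];

% contains is case-sensitive by default, so this finds nothing.
noMatch = contains(stringy, "BLUE");

% The IgnoreCase option relaxes that.
hasBlue = contains(stringy, "BLUE", 'IgnoreCase', true);   % elements 4 and 6

% A string array of patterns matches if any one of them is found.
hasColor = contains(stringy, ["red" "blue"]);              % elements 3, 4, 6

% startsWith and endsWith anchor the match to one end of each element.
startsB  = startsWith(stringy, "b");                       % elements 4, 5, 6
endsBang = endsWith(stringy, "!");                         % element 8
```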

Example: Replace

The function replace is a quick way to replace one piece of text with another. Consider our friend stringy. We can easily replace "fish" with "squish" using the following syntax:

squishy = replace(stringy, 'fish', 'squish')

squishy = 

  8×1 string array

    "one squish,"
    "two squish,"
    "red squish,"
    "blue squish,"
    "black squish,"
    "blue squish,"
    "old squish,"
    "new squish!"
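replace also accepts several search terms at once. In this sketch (using a shortened stringy and the made-up names swapped and paired), several terms are either mapped to one replacement or paired one-to-one with replacements:

```matlab
stringy = ["one fish,"; "two fish,"; "red fish,"; "blue fish,"];

% Several old terms, one new term.
swapped = replace(stringy, ["one" "two"], "some");

% Old and new terms paired one-to-one.
paired = replace(stringy, ["red" "blue"], ["green" "gold"]);

% replace returns a new array; stringy itself is unchanged.
stringy(1)             % still "one fish,"
```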

Example: Split

The function split splits each element at the indicated delimiter (whitespace is the default delimiter):

splity = split(squishy)

splity = 

  8×2 string array

    "one"      "squish,"
    "two"      "squish,"
    "red"      "squish,"
    "blue"     "squish,"
    "black"    "squish,"
    "blue"     "squish,"
    "old"      "squish,"
    "new"      "squish!"

Notice that the variable splity is an 8×2 string array. The first column in the array contains the characters preceding the space delimiter and the second column contains the characters that follow the delimiter.
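The sketch below shows an explicit delimiter, the companion function join, and one caveat: when split is applied to an array, every element must break into the same number of pieces. The names record, parts, rejoined, ragged, and splitFailed are illustrative.

```matlab
% Split a single string at commas: the result is a column of pieces.
record   = "one,two,red,blue";
parts    = split(record, ",");       % 4x1 string array
rejoined = join(parts, " & ");       % "one & two & red & blue"

% Elements that split into different numbers of pieces raise an error.
ragged = ["one fish"; "big blue fish"];
splitFailed = false;
try
    split(ragged);
catch
    splitFailed = true;              % 2 pieces vs 3 pieces
end
```

For ragged input like this, one option is to split each element separately inside a loop.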

Example: Erase

If we decide we don’t want any punctuation in our string array after all, we can use the erase function to quickly remove the offending punctuation, as follows:

erase(splity, ["," "!"])

ans = 

  8×2 string array

    "one"      "squish"
    "two"      "squish"
    "red"      "squish"
    "blue"     "squish"
    "black"    "squish"
    "blue"     "squish"
    "old"      "squish"
    "new"      "squish"
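These functions chain together naturally, since each one takes a string array and returns a string array. A short sketch of a clean-up pipeline on a shortened stringy (the names clean, words, hyphened, and shout are illustrative):

```matlab
stringy = ["one fish,"; "two fish,"; "new fish!"];

clean    = erase(stringy, ["," "!"]);   % remove punctuation
words    = split(clean);                % 3x2: one column per word
hyphened = join(words, "-");            % 3x1: "one-fish", ...
shout    = upper(hyphened);             % "ONE-FISH", ...
```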

As you can imagine, the examples are endless. Refer to the reference page as needed or for inspiration.

fin.