Managing Files and Folders

Overview

In this module, we will learn how to access files from MATLAB using path strings and built-in MATLAB dialog windows.

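This module is broken down into the following sections: The MATLAB Current Folder, The Path String, The MATLAB search path, Getting File Paths, and Exploring Folder Contents.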

Learning Objectives

  • Define a path string
  • Be able to create your own path strings, including paths to files and folders
  • Define file separator
  • Discriminate between path strings from different operating systems based on the file separator used
  • Describe the MATLAB search path
  • Be able to add folders to, or remove folders from, the MATLAB search path
  • Be able to use all of the Important Functions listed in this module

Important Terminology

  • Directory – a folder on your computer
  • Path – a string that contains the unique location of a file or folder
  • Wildcard character – a character that substitutes for any other character or character range in a path string. In MATLAB, this character is an asterisk (*)
  • Structure Array – a hierarchical data type in MATLAB that groups related data using data containers called fields. Each field can contain data of any type or size.

Important Functions

  • cd – Change current folder
  • dir – List folder contents
  • fileparts – Parts of file name and path
  • filesep – File separator for current platform
  • fullfile – Build full file name from parts
  • numel – Number of array elements
  • pwd – Identify current folder
  • uigetfile – Open file selection dialog box
  • uigetdir – Open folder selection dialog box
  • userpath – Return the user path

The MATLAB Current Folder

TOP | The MATLAB Current Folder | The Path String | The MATLAB search path | Getting File Paths | Exploring Folder Contents

When you launch MATLAB, it automatically opens a folder on your hard drive. This folder is known as the “Current Folder” and there is a window in MATLAB that shows the content of this folder.

You can find a representation of the location of the Current Folder on your hard drive just below the Ribbon Tool Strip in the Address field of MATLAB:

  • This example shows the location of the MATLAB folder, which is the default folder that MATLAB opens.

You can change the Current Folder by clicking on the “Browse for Folder” icon that is beside the address bar (right next to that blue folder).

The Path String

A path string contains the unique location of a file or folder. This string represents a flattened directory tree hierarchy in which the outermost folder is found on the far left of the string and the innermost folder or file is found on the far right of the string.

Calling the function pwd returns the current folder path to the command window.

>>pwd

ans =

/Users/ernesto/Documents/MATLAB

  • As you can see, the path is a character array

You can see this same path as a tooltip if you hover your mouse over the blue folder in the address field under the Ribbon tool strip. If you right-click on that blue folder, you can copy the path to the clipboard.

The File Separator

The file separator is the character that separates individual folder and file names in a path string. This character differs between Mac and PC.

You can use the function filesep to return the correct slash for your operating system:

>>filesep

ans =

/

On a Mac (and UNIX), folders and files are separated by a forward slash (“/”). On PCs, they are separated by a backward slash (“\”); drives are indicated by a letter followed by a colon (for example, “C:”), and network locations begin with a double backward slash (“\\”).

So, the path string “/Users/ernesto/Documents/MATLAB” is a Mac (or UNIX) specific string that points to the MATLAB folder, which is in the Documents folder, which is in the ernesto folder, which is in the Users folder, which is in the root folder. This path can also be visualized as a directory tree hierarchy, where each folder is contained in the folder listed directly above it:

/
    Users
        ernesto
            Documents
                MATLAB
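As a rough sketch of how the separator identifies the operating system, and of how the folders in a path can be listed in order, consider the following. The PC path is a hypothetical counterpart of the Mac path used in this module, and the variable names are illustrative only.

```matlab
% Example path strings; pc_path is a hypothetical PC counterpart.
mac_path = '/Users/ernesto/Documents/MATLAB';
pc_path  = 'C:\Users\ernesto\Documents\MATLAB';

% A path that contains a backward slash is PC style.
mac_is_pc_style = any(mac_path == '\');
pc_is_pc_style  = any(pc_path == '\');

% Split the Mac path at each forward slash to list the folders,
% outermost first. The leading slash produces an empty first piece,
% so empty pieces are dropped.
folders = strsplit(mac_path, '/');
folders = folders(~cellfun(@isempty, folders));
disp(folders)
```

Here folders is a cell array in which each element is one level of the directory tree.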

The parts of a path string

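The function fileparts accepts a path string as an input and returns up to three outputs: the path to the containing folder, the name, and the file extension. Requesting the first two for the current folder gives:

[parent_path, folder_name] = fileparts(pwd)

parent_path =

/Users/ernesto/Documents


folder_name =

MATLAB

  • The variable parent_path is the full path string to the folder that contains the current folder, and folder_name is the name of the current folder.
  • The first output is named parent_path rather than path because path is also the name of the MATLAB function that returns the search path.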

Using folder_name, we can test whether we are in the MATLAB folder by invoking the strcmp function:

if strcmp(folder_name, 'MATLAB')
    disp('Great! You are in the right folder.')
else
    disp('Whoops, you are not in the MATLAB folder. Please navigate to the MATLAB folder.')
end

The MATLAB search path

The MATLAB search path is a collection of folder paths that tells MATLAB where to look for functions. It is preset to include the folders for all of the functions that are built into MATLAB and for all of the installed toolboxes.

To see the MATLAB search path, click on the “Set Path” button in the Home tab of the Ribbon interface.

In the dialog window that appears, you can add your own personal folders to the search path so that MATLAB can find any functions that you write yourself or that you download from the internet. By default, MATLAB automatically includes a startup folder called MATLAB that can be found in your Documents folder. This folder is known as the userpath. To see the path to your startup MATLAB folder, simply type in the command window:

userpath
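Folders can also be added to or removed from the search path from the command line with addpath and rmpath. The following is a minimal sketch, assuming a hypothetical folder called my_functions that is created in MATLAB's temporary folder so the example can run anywhere. Changes made this way last only for the current session unless the path is saved.

```matlab
% Hypothetical folder that would hold your own functions.
my_folder = fullfile(tempdir, 'my_functions');
if exist(my_folder, 'dir') ~= 7
    mkdir(my_folder);
end

addpath(my_folder);    % add the folder to the search path

% path returns the whole search path as one character array in which
% folders are separated by pathsep (':' on Mac/UNIX, ';' on PC).
search_folders = strsplit(path, pathsep);
fprintf('The search path now lists %d folders.\n', numel(search_folders));

rmpath(my_folder);     % remove the folder from the search path again
still_listed = any(strcmp(strsplit(path, pathsep), my_folder));
```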


Getting File Paths

Manually Selecting Files

Both Mac and PC allow you to select files in their system file browsers and copy the file paths manually. You then simply paste the path as a character array in MATLAB and assign it to a variable.

To do this:

When manually setting file paths, don’t forget that they need to be character arrays, so they need to be enclosed in single quotes:

file_path = '/Users/ernesto/Documents/MATLAB'
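Because a hand-typed or pasted path can easily contain a typo, it may help to confirm that the path points at something before using it. The sketch below uses the exist function for this. The path is the example from this module and will probably not exist on your computer, in which case the last message is displayed.

```matlab
% Path typed or pasted by hand (the example path from this module).
file_path = '/Users/ernesto/Documents/MATLAB';

% exist returns 7 when the path points to a folder. With the 'file'
% option it returns 2 when the path points to a file.
is_folder = exist(file_path, 'dir') == 7;
is_file   = exist(file_path, 'file') == 2;

if is_folder
    disp('The path points to a folder.')
elseif is_file
    disp('The path points to a file.')
else
    disp('Nothing was found at this path. Check it for typos.')
end
```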

Generating Individual File Paths

To open files and import data into MATLAB programmatically, you can use the function uigetfile. This function returns the name and folder path of a file that the user selects in a dialog window.

For example, consider the following folder containing 10 .csv files (a spreadsheet format)

The function uigetfile calls the system file browser that allows you to navigate to this Weather Data folder, and choose, for example, the “w2013.csv” file. The function then outputs the following:

>>[Filename, Pathname] = uigetfile('*.csv')

Filename =

    'w2013.csv'


Pathname =

    '/Users/ernesto/Documents/Unit 1/weather_data/'

  • The input *.csv indicates that the dialog window should show only .csv files.
  • The asterisk in this context is a wildcard character that means “any name” followed by .csv.

NOTE: After running uigetfile, the function returns just the file name and the path of the selected file. The file itself is NOT opened. Filename contains the name of the file, and Pathname contains the path to the folder where the file is stored. If you cancel the dialog box, both variables are set to 0.
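Since a cancelled dialog returns 0 instead of a character array, code that uses the outputs can check for this case first. The following is one possible sketch. The dialog title is an arbitrary choice, and the block opens a dialog window, so it needs to be run interactively.

```matlab
% Ask the user to pick a .csv file; the second input is the dialog title.
[Filename, Pathname] = uigetfile('*.csv', 'Select a weather file');

% uigetfile returns 0 for both outputs when the dialog is cancelled.
if isequal(Filename, 0)
    disp('No file was selected.')
else
    disp(['You selected: ' fullfile(Pathname, Filename)])
end
```

The isequal test is used here because it works whether Filename holds the number 0 or a character array.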

Generating a Full Path

To generate the full path (the programmatic location and name of the file), you use the function fullfile:

>>full_file_path = fullfile(Pathname, Filename)

full_file_path =

    '/Users/ernesto/Documents/Unit 1/weather_data/w2013.csv'

  • full_file_path is a character array that contains the folder and file name. This is called the file path.

Using this file path, we can now read in the file.
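As a small illustration of reading a file from its path, the sketch below writes a tiny stand-in .csv file to MATLAB's temporary folder and reads it back with readtable. The file name w_demo.csv and the column names Day and Temp are hypothetical, not the contents of the weather files used in this module.

```matlab
% Create a small stand-in CSV file so the example is self-contained.
demo_path = fullfile(tempdir, 'w_demo.csv');
fid = fopen(demo_path, 'w');
fprintf(fid, 'Day,Temp\n1,20.5\n2,21.0\n');
fclose(fid);

% Read the file at that path into a table; the first row of the file
% supplies the variable (column) names.
weather = readtable(demo_path);
disp(weather)

delete(demo_path);   % remove the stand-in file
```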

Challenge: What does the function fileparts return when you input full_file_path?


Exploring Folder Contents

The uigetfile function is useful if you want to select an individual file. But what if you have a whole folder of files that you want to import? In that case, it is far easier to simply identify the folder for MATLAB and then have MATLAB automatically generate file paths for each file inside that folder.

Folder Dialog and Changing the Current Folder

To get the path to a folder, use the function uigetdir:

>> weather_folder = uigetdir(pwd,'Find the Weather Folder')

weather_folder =

    '/Users/ernesto/Documents/Unit 1/weather_data'


To change the MATLAB current folder to the selected folder, use the function cd:

cd(weather_folder)
pwd
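When a script changes the Current Folder, it is often convenient to return to the starting folder afterwards. Called with an output, cd returns the folder that was current before the change. The sketch below uses MATLAB's temporary folder (tempdir) as a stand-in for the weather folder.

```matlab
start_folder = pwd;                 % remember where we started

% Change folders; cd returns the folder we just left.
previous_folder = cd(tempdir);
disp(['Now working in: ' pwd])

% ... work with the files in the new folder here ...

cd(previous_folder);                % go back to the starting folder
back_home = strcmp(pwd, start_folder);
```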

Finding Content in a Folder

The function dir returns information on the contents of a folder. In this example, we are going to use the wildcard character * to find any file whose name ends in 13, whatever its extension. Since we do not indicate a path, MATLAB searches in the Current Folder (which we just changed to be the weather_folder).

>>contents = dir('*13.*')

contents = 

       name: 'w2013.csv'
       date: '20-Jun-2015 15:18:28'
      bytes: 3256
      isdir: 0
    datenum: 7.3614e+05

The variable contents is a structure array: a data type that groups related data using data containers called fields. Each field can contain data of any type or size.

The structure array contents has 5 fields (newer MATLAB releases add a sixth field, folder, that holds the path to the folder containing the item):

  • name – the name of the item
  • date – the modification date of the item
  • bytes – the size of the file in bytes
  • isdir – logical value that is true (1) if the item is a folder
  • datenum – the modification date as a serial date number

To access the contents of a field, use dot notation:

>>contents.name
ans =

w2013.csv

This syntax returns the file name as a character array.
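Dot notation also works on the isdir field, which is handy when a folder holds subfolders as well as files. The sketch below builds a small temporary folder (the names archive and w2013.csv are stand-ins) and keeps only the items that are not folders.

```matlab
% Build a stand-in folder containing one file and one subfolder.
demo_folder = tempname;                       % unique temporary name
mkdir(demo_folder);
mkdir(fullfile(demo_folder, 'archive'));
fclose(fopen(fullfile(demo_folder, 'w2013.csv'), 'w'));

% With no wildcard, dir lists everything, including the special
% entries '.' and '..' and the subfolder 'archive'.
items = dir(demo_folder);

% [items.isdir] collects the isdir field of every element into one
% logical array; ~ inverts it so that only files are kept.
files_only = items(~[items.isdir]);
disp({files_only.name})

rmdir(demo_folder, 's');                      % remove the stand-in folder
```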

Using the Wildcard character

The asterisk serves as a wildcard character. It matches any sequence of characters, of any length.

For example, to find all the files in the current folder that end with .csv, use this syntax:

>>contents = dir('*.csv')

contents = 

10x1 struct array with fields:

    name
    date
    bytes
    isdir
    datenum

The wildcard character in this case covers any filename.
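The wildcard can also be combined with a folder path, so that dir searches a folder without changing the Current Folder. The following sketch assumes a temporary stand-in folder with three hypothetical files and compares two wildcard patterns.

```matlab
% Build a stand-in folder with two .csv files and one .txt file.
demo_folder = tempname;
mkdir(demo_folder);
names = {'w2012.csv', 'w2013.csv', 'notes.txt'};
for k = 1:numel(names)
    fclose(fopen(fullfile(demo_folder, names{k}), 'w'));
end

% Any name followed by .csv
csv_files = dir(fullfile(demo_folder, '*.csv'));
csv_count = numel(csv_files);

% Any name ending in 13, followed by any extension
year_13 = dir(fullfile(demo_folder, '*13.*'));

disp({csv_files.name})
disp({year_13.name})

rmdir(demo_folder, 's');   % remove the stand-in folder
```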

The contents structure

Note that contents is a 10x1 structure array. This means that 10 files with the .csv extension were found in the current folder.

The function numel returns the number of elements in an array.

>>file_count = numel(contents)

file_count =

    10

To return the fifth element in the structure array, you index as normal:

>>contents(5)

ans = 

       name: 'w2009.csv'
       date: '20-Jun-2015 15:18:30'
      bytes: 3163
      isdir: 0
    datenum: 7.3614e+05

The result is a 1x1 structure array.

To extract the name of the 6th file, you combine indexing and dot notation as follows:

contents(6).name

ans =

w2010.csv

The result here is a character array (the content inside the name field in the 6th element of the structure array)

To return the names of ALL the files, use this syntax:

contents.name

Notice that the above syntax simply returns each name sequentially into the command window, overwriting ans each time. To capture all of the names in a cell array, you need to use curly brackets:

file_names = {contents.name}' % create a cell array that contains the file names

This syntax returns file_names as a 10x1 cell array.
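When pulling a single name back out of a cell array, the type of bracket matters. The sketch below uses a hand-built structure array with three hypothetical file names as a stand-in for the output of dir.

```matlab
% Stand-in for the structure array returned by dir.
contents = struct('name', {'w2005.csv', 'w2006.csv', 'w2007.csv'});
file_names = {contents.name}';     % 3x1 cell array of file names

one_cell = file_names(2);          % parentheses return a 1x1 cell array
one_name = file_names{2};          % curly brackets return the contents

disp(class(one_cell))              % cell
disp(class(one_name))              % char
```

Functions that expect a path as a character array, such as fullfile and readtable, are given the curly-bracket form.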

Load ALL Files

Use a for loop to generate the full path for each file name, read in each table, and concatenate it to the previous table.

Remember, numel returns the number of elements in an array and we already have the full path to the weather folder (weather_folder):

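T = table;                                  % start with an empty table
for n = 1:numel(contents)
   file_path = fullfile(weather_folder, file_names{n});   % full path to the nth file
   t = readtable(file_path);                % read the file into a table
   T = [T; t];                              % append it to the running table
end

Notice that in this for loop, the table variable T grows incrementally with each iteration.

While this works satisfactorily for a small dataset, growing a variable inside a loop forces MATLAB to reallocate memory on every iteration, so large datasets can run slowly or run into memory problems if you do not preallocate.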

Better coding practice would be to first preallocate a table variable with 300 empty rows.
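One alternative to preallocating the table itself, shown here only as a sketch, is to preallocate a cell array with one slot per file, store each table in its slot, and concatenate once after the loop. This is a different technique from the 300-row preallocation described above. The stand-in files and the column names Day and Temp are hypothetical.

```matlab
% Stand-in data: three small CSV files in a temporary folder.
weather_folder = tempname;
mkdir(weather_folder);
for yr = 2012:2014
    fid = fopen(fullfile(weather_folder, sprintf('w%d.csv', yr)), 'w');
    fprintf(fid, 'Day,Temp\n1,%d\n2,%d\n', yr - 2000, yr - 1999);
    fclose(fid);
end

contents = dir(fullfile(weather_folder, '*.csv'));

% Preallocate one slot per file, then fill the slots in the loop.
tables = cell(numel(contents), 1);
for n = 1:numel(contents)
    file_path = fullfile(weather_folder, contents(n).name);
    tables{n} = readtable(file_path);
end

% Concatenate all of the tables once, after the loop has finished.
T = vertcat(tables{:});
disp(T)

rmdir(weather_folder, 's');   % remove the stand-in folder
```

Because T is built in a single step, no table is repeatedly copied as the loop runs.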

  • How would you capture the output from the above for loop into a cell array called full_paths?
