Overview
In this module, we will learn how to access files from MATLAB using path strings and built-in MATLAB dialog windows.
This module is broken down into the following sections:
- The MATLAB Current Folder
- The Path String
- The MATLAB search path
- Getting File Paths
- Exploring Folder Contents
Learning Objectives
- Define a path string
- Be able to create your own path strings, including paths to files and folders
- Define file separator
- Discriminate between path strings from different operating systems based on the file separator used
- Describe the MATLAB search path
- Be able to add folders to, or remove folders from, the MATLAB search path
- Be able to use all of the Important Functions listed in this module
Important Terminology
- Directory – a folder on your computer
- Path – a string that contains the unique location of a file or folder
- Wildcard character – a character that substitutes for any other character or character range in a path string. In MATLAB, this character is an asterisk (*)
- Structure Array – a hierarchical data type in MATLAB that groups related data using data containers called fields. Each field can contain data of any type or size.
Important Functions
- cd – change current folder
- dir – List folder contents
- fileparts – Parts of file name and path
- filesep – File separator for current platform
- fullfile – Build full file name from parts
- numel – number of array elements
- pwd – identify current folder
- uigetfile – Open File selection dialog box
- uigetdir – Open Folder selection dialog box
- userpath – returns the user path
The MATLAB Current Folder
When you launch MATLAB, it automatically opens a folder on your hard drive. This folder is known as the “Current Folder” and there is a window in MATLAB that shows the content of this folder.
You can find a representation of the location of the Current Folder on your hard drive just below the Ribbon Tool Strip in the Address field of MATLAB:
- This example shows the location of the MATLAB folder, which is the default folder that MATLAB opens.
You can change the Current Folder by clicking on the “Browse for Folder” icon that is beside the address bar (right next to that blue folder).
The Path String
A path string contains the unique location of a file or folder. This string represents a flattened directory tree hierarchy in which the outermost folder is found on the far left of the string and the innermost folder or file is found on the far right of the string.
Calling the function pwd returns the current folder path to the command window.
>>pwd
ans =
/Users/ernesto/Documents/MATLAB
- As you can see, the path is a character array
You can see this same path as a tooltip if you hover your mouse over the blue folder in the address field under the Ribbon tool strip. If you right-click on that blue folder, you can copy the path to the clipboard.
The File Separator
The file separator is the character that separates individual folder and file names in a path string. This character differs between Mac and PC.
You can use the function filesep to return the correct slash for your operating system:
>>filesep
ans =
/
On a Mac (and Unix), folders and files are separated by a forward slash (“/”). On PCs, they are separated by a backward slash (“\”), drives are indicated by a letter followed by a colon (for example, “C:”), and network locations begin with a double backward slash (“\\”).
So, the path string “/Users/ernesto/Documents/MATLAB” is a Mac (or Unix) specific string that points to the MATLAB folder, which is in the Documents folder, which is in the ernesto folder, which is in the Users folder, which is in the root folder. This path can also be visualized as a directory tree hierarchy, where each folder is contained in the folder listed directly above it, as shown below:
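The hierarchy in a path string can also be printed from MATLAB itself. The following is a minimal sketch that assumes the Mac/Unix example path used in this module; the variable names (example_path, parts, rebuilt) are made up for illustration.

```matlab
% Example path from the text (Mac/Unix style, so the separator is '/').
example_path = '/Users/ernesto/Documents/MATLAB';

% Split the string at each separator. The leading '/' produces an empty
% first piece, which is dropped here.
parts = strsplit(example_path, '/');
parts = parts(~cellfun(@isempty, parts));

% Print one folder per line, indented two spaces per level of depth.
for k = 1:numel(parts)
    disp([repmat('  ', 1, k - 1), parts{k}]);
end

% Join the pieces again using the separator of the current platform.
rebuilt = strjoin(parts, filesep);
```

On a PC, rebuilt uses backward slashes; on a Mac it uses forward slashes. This is why filesep (or fullfile) is preferable to typing the slash by hand.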
The parts of a path string
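The function fileparts accepts a path string as an input. Called with two outputs, it returns the following:
>>[parent_path, folder_name] = fileparts(pwd)
parent_path =
/Users/ernesto/Documents
folder_name =
MATLAB
- The variable parent_path is the path string to the folder that contains the current folder, and folder_name is the name of the current folder. (The name parent_path is used rather than path because path is also the name of a built-in MATLAB function.)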
Using folder_name, we can test whether we are in the MATLAB folder by invoking the strcmp function:
if strcmp(folder_name, 'MATLAB')
    disp('Great! You are in the right folder.')
else
    disp('Whoops, you are not in the MATLAB folder. Please navigate to the MATLAB folder.')
end
The MATLAB search path
The MATLAB search path is a collection of paths that tells MATLAB where to look for functions. This path is preset to include all of the functions that are built-in to MATLAB and all of the installed toolboxes.
To see the MATLAB search path, click on the “Set Path” button in the Home tab of the Ribbon interface.
In the dialog window that appears, you can add your own personal folders to the search path so that MATLAB can find any functions that you write yourself or that you download from the internet. MATLAB also places one user-specific folder on the search path automatically; this folder is known as the userpath. By default, it is a startup folder called MATLAB that can be found in your Documents folder. To see the path to your startup MATLAB folder, simply type in the command window:
userpath
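Folders can also be added to or removed from the search path from the command line with addpath and rmpath. The sketch below uses a temporary folder with a made-up name (my_matlab_functions) so that it runs anywhere; in practice you would substitute the folder that holds your own functions.

```matlab
% Hypothetical folder for your own functions, created under the system
% temporary folder so that the example is safe to run.
my_folder = fullfile(tempdir, 'my_matlab_functions');
if ~exist(my_folder, 'dir')
    mkdir(my_folder);
end

% The search path is one long character array whose entries are
% separated by pathsep, so split it before looking for the folder.
addpath(my_folder);
on_path_after_add = ismember(my_folder, strsplit(path, pathsep));

rmpath(my_folder);
on_path_after_remove = ismember(my_folder, strsplit(path, pathsep));

rmdir(my_folder);   % clean up the temporary folder
```

Changes made with addpath and rmpath last only for the current MATLAB session unless the path is saved, for example with savepath or the Save button in the Set Path dialog.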
Getting File Paths
Manually Selecting Files
Both Mac and PC allow you to select files in their system file browsers and copy the file paths manually. You then simply paste the path as a character array in MATLAB and assign it to a variable.
To do this:
- On the Mac, option-right-click a file and select the menu item that copies the item as a “pathname.”
- On a PC, shift-right-click the file and select the “copy as path” menu item.
When manually setting file paths, don’t forget that they need to be character arrays, so they need to be enclosed in single quotes:
file_path = '/Users/ernesto/Documents/MATLAB'
Generating Individual File Paths
To open files and import data into MATLAB programmatically, you can use the function uigetfile. This function returns the file path from a user-selected file.
For example, consider the following folder containing 10 .csv files (a spreadsheet format).
The function uigetfile calls the system file browser, which allows you to navigate to this Weather Data folder and choose, for example, the “w2013.csv” file. The function then outputs the following:
>>[Filename, Pathname] = uigetfile('*.csv')
Filename =
'w2013.csv'
Pathname =
'/Users/ernesto/Documents/Unit 1/weather_data/'
- The input *.csv indicates that the dialog window should show only .csv files.
- The asterisk in this context is a wildcard character that means “any name” DOT csv.
NOTE: After running uigetfile, the function returns just the file name and the path of the selected file. The file itself is NOT opened. Filename contains the name of the file, and Pathname contains the path to the folder where the file is stored. If you cancel the dialog box, both variables are set to 0 (the number zero, not a character array).
Generating a Full Path
To generate the full path (the programmatic location and name of the file), you use the function fullfile:
>> full_file_path = fullfile(Pathname, Filename)
full_file_path =
'/Users/ernesto/Documents/Unit 1/weather_data/w2013.csv'
full_file_path is a character array that contains the folder and file name. This is called the file path.
Using this file path, we can now read in the file.
Challenge: What does the function fileparts return when you input full_file_path?
Exploring Folder Contents
The uigetfile function is useful if you want to select an individual file. But what if you have a whole folder of files that you want to import? In that case, it is far easier to identify the folder for MATLAB and then have MATLAB automatically generate file paths for each file inside that folder.
Folder Dialog and Changing the Current Folder
To get the path to a folder, use the function uigetdir:
>> weather_folder = uigetdir(pwd,'Find the Weather Folder')
weather_folder =
'/Users/ernesto/Documents/Unit 1/weather_data'
To change the MATLAB current folder to the selected folder, use the function cd:
cd(weather_folder)
pwd
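When a script changes the Current Folder, it is often convenient to return to the original folder afterwards. As a minimal sketch: cd returns the folder you just left when it is called with an output. Here tempdir stands in for a folder chosen with uigetdir (an assumption made so that the example runs without a dialog window).

```matlab
start_folder = pwd;                     % remember where we started

% Stand-in for: target_folder = uigetdir(pwd, 'Find the Weather Folder');
target_folder = tempdir;

previous_folder = cd(target_folder);    % cd returns the folder we just left
% ... work with the files in target_folder here ...
cd(previous_folder);                    % go back to the original folder

back_where_we_started = strcmp(pwd, start_folder);
```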
Finding Content in a Folder
The function dir returns information on the contents of a folder. In this example, we use the wildcard character * to find a file whose name ends in 13, just before the extension. Since we do not indicate a path, MATLAB searches in the Current Folder (which we just changed to be the weather_folder).
>>contents = dir('*13.*')
contents =
name: 'w2013.csv'
date: '20-Jun-2015 15:18:28'
bytes: 3256
isdir: 0
datenum: 7.3614e+05
The variable contents is a data type called a structure array. A structure array is a data type that groups related data using data containers called fields. Each field can contain data of any type or size. You access a field using dot notation.
The structure array contents has 5 fields (newer MATLAB releases also return a sixth field, folder, which holds the folder that contains the item):
- name – the name of the item
- date – the modification date of the file
- bytes – the size of the file
- isdir – logical value that is true (1) if the item is a folder
- datenum – Modification date as serial date number
To access the contents of a field, use dot notation:
>>contents.name
ans =
w2013.csv
This syntax returns the file name as a character array.
Using the Wildcard character
The asterisk serves as a wildcard character. It stands for any character and any number of characters.
For example, to find all the files in the current folder that end with .csv, use this syntax:
>>contents = dir('*.csv')
contents =
10x1 struct array with fields:
name
date
bytes
isdir
datenum
The wildcard character in this case covers any filename.
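To experiment with wildcard patterns without the weather data, a scratch folder can be created on the fly. The sketch below is only an illustration: the folder and file names are made up, and the pattern is passed to dir together with the folder path so that the Current Folder does not need to change.

```matlab
% Build a small scratch folder with three empty files (hypothetical names).
demo_folder = tempname;
mkdir(demo_folder);
demo_files = {'w2012.csv', 'w2013.csv', 'notes.txt'};
for k = 1:numel(demo_files)
    fid = fopen(fullfile(demo_folder, demo_files{k}), 'w');
    fclose(fid);
end

% Combine the folder path and the pattern with fullfile.
csv_files = dir(fullfile(demo_folder, '*.csv'));   % any name ending in .csv
only_2013 = dir(fullfile(demo_folder, '*13.*'));   % name ends in 13, any extension

csv_names = sort({csv_files.name});

rmdir(demo_folder, 's');   % remove the scratch folder and its files
```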
The contents structure
Note that contents is a 10x1 structure array. This means that 10 files with the .csv extension were found in the current folder.
The function numel returns the number of elements in an array.
>>file_count = numel(contents)
file_count =
10
To return the fifth element in the structure array, you index as normal:
>>contents(5)
ans =
name: 'w2009.csv'
date: '20-Jun-2015 15:18:30'
bytes: 3163
isdir: 0
datenum: 7.3614e+05
The result is a 1x1 structure array.
To extract the name of the 6th file, you combine indexing and dot notation as follows:
contents(6).name
ans =
w2010.csv
The result here is a character array (the content inside the name field in the 6th element of the structure array).
To return the names of ALL the files, use this syntax:
contents.name
Notice that the above syntax simply returns each name sequentially into the command window, overwriting ans each time. To capture all of the names in a cell array, you need to use curly brackets.
file_names = {contents.name}' % create a cell array that contains the file names
The result, file_names, is a 10x1 cell array.
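The same idea works for the other fields. As a minimal sketch, the isdir field can be collected with square brackets and used to keep only the files in a listing. The structure array below is a hand-built stand-in for the output of dir, with made-up entries, so that the example does not depend on any particular folder.

```matlab
% A small stand-in for the output of dir (hypothetical entries).
listing = struct('name',  {'.', '..', 'plots', 'w2013.csv', 'w2014.csv'}, ...
                 'isdir', {true, true, true, false, false});

is_folder  = [listing.isdir];        % square brackets -> 1x5 logical array
files_only = listing(~is_folder);    % keep only the entries that are files
file_names = {files_only.name}';     % curly brackets -> 2x1 cell array
```

Square brackets suit fields that hold a single number or logical value per element, while curly brackets suit fields such as name, whose character arrays differ in length.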
Load ALL Files
Use a for loop to generate the full path for each file name, read in each table, and concatenate it to the previous table.
Remember, numel returns the number of elements in an array, and we already have the full path to the weather folder (weather_folder):
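T = table;                                                % start with an empty table
for n = 1:numel(contents)
    file_path = fullfile(weather_folder, file_names{n});  % full path to the nth file
    t = readtable(file_path);                             % read the file into a table
    T = [T; t];                                           % append it to the running table
end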
Notice that in this for loop, the table variable T grows with each iteration of the loop.
While this works satisfactorily for a small dataset, growing a variable inside a loop forces MATLAB to reallocate memory repeatedly, so you run the risk of slow performance and memory problems if you do not preallocate.
Better coding practice would be to first preallocate a table variable with 300 empty rows.
- How would you capture the output from the above for loop into a cell array called full_paths?
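Returning to the preallocation point above: one common alternative to preallocating 300 table rows (a different technique swapped in here, not the one the text proposes) is to preallocate a cell array with one cell per file, read each table into its own cell, and concatenate once at the end. The sketch below writes three tiny CSV files with made-up data into a scratch folder so that it can run on its own; the names demo_folder, tables, Month and Value are all hypothetical.

```matlab
% Scratch folder with three tiny CSV files, standing in for the weather folder.
demo_folder = tempname;
mkdir(demo_folder);
for yr = 2011:2013
    t = table([1; 2], [10; 20] + yr, 'VariableNames', {'Month', 'Value'});
    writetable(t, fullfile(demo_folder, sprintf('w%d.csv', yr)));
end

contents = dir(fullfile(demo_folder, '*.csv'));

% Preallocate one cell per file, fill the cells, then concatenate once.
tables = cell(numel(contents), 1);
for n = 1:numel(contents)
    tables{n} = readtable(fullfile(demo_folder, contents(n).name));
end
T = vertcat(tables{:});

rmdir(demo_folder, 's');   % remove the scratch folder and its files
```

With this approach the number of rows does not need to be known in advance, since only the number of files is used for preallocation.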