2  Setting File Paths and Understanding the Current Working Directory

2.1 Learning Objectives

By the end of this lesson, students should be able to:

  • Check your current working directory in Python.

  • Use relative and absolute paths correctly.

  • Read files using proper path specification.

  • Fix common file path errors in VS Code.

  • Generate HTML from your notebook.

2.2 What is the Current Working Directory?

  • In Python, the current working directory (CWD) is the folder your code is running from.

  • When you use a relative path, it’s relative to this directory.

In Python, the os module allows you to interact with the operating system and the file system. You need to import it before using it.

Code
import os

print(os.getcwd())
c:\Users\lsi8012\workspace\Intro_to_programming_for_data_sci_wi25

This tells you where Python is currently looking for files by default.
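
To make the link between the CWD and relative paths concrete, here is a minimal sketch that resolves a relative path by hand and compares the result with os.path.abspath. The path data/myfile.csv is only an illustration and does not need to exist for the code to run.

```python
import os

cwd = os.getcwd()  # folder that relative paths are resolved against
relative_path = os.path.join("data", "myfile.csv")  # illustrative path only

# os.path.abspath joins the CWD with the relative path and normalizes the result
resolved = os.path.abspath(relative_path)
by_hand = os.path.normpath(os.path.join(cwd, relative_path))

print("CWD:     ", cwd)
print("Resolved:", resolved)
print("Same as joining by hand:", resolved == by_hand)
```

If the resolved path is not where your file actually lives, either the relative path or the CWD needs to change.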

2.3 Absolute vs. Relative Paths

2.3.1 Absolute Path

  • Full path from the root of the file system.
Code
path = "/Users/lizhen/Desktop/data/myfile.csv"

2.3.2 Relative Path

  • Path relative to the current working directory.
Code
path = "data/myfile.csv"  # if your CWD is the parent folder of "data"

In your code, we highly recommend using relative paths whenever possible for better portability!
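
As a rough illustration of the difference, the sketch below converts between the two forms with os.path.abspath and os.path.relpath and checks each one with os.path.isabs. The path data/myfile.csv is again just a placeholder, so nothing is read from disk.

```python
import os

# Build an absolute path from a relative one (placeholder file name)
absolute_path = os.path.abspath(os.path.join("data", "myfile.csv"))

# Convert it back to a path relative to the current working directory
relative_path = os.path.relpath(absolute_path, start=os.getcwd())

print(absolute_path, "-> absolute?", os.path.isabs(absolute_path))
print(relative_path, "-> absolute?", os.path.isabs(relative_path))
```

The absolute form changes from machine to machine, while the relative form stays the same as long as the project folder keeps its layout.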

2.3.3 Let’s Try: Reading a Data File

Suppose your directory looks like this:

[Figure: file_directory]

In lecture.ipynb, you can do:

Code
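import json

# Relative path: resolved against the current working directory
with open("Datasets/filtered_movies.json", encoding="utf8") as file:
    movie_data = json.load(file)  # parse the JSON file into a Python list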

movie_data[:2]
[{'Title': '12 Rounds',
  'US Gross': 12234694,
  'Worldwide Gross': 18184083,
  'US DVD Sales': 8283859,
  'Production Budget': 20000000,
  'Release Date': 'Mar 27 2009',
  'MPAA Rating': 'PG-13',
  'Running Time min': 108,
  'Distributor': '20th Century Fox',
  'Source': 'Original Screenplay',
  'Major Genre': 'Action',
  'Creative Type': 'Contemporary Fiction',
  'Director': 'Renny Harlin',
  'Rotten Tomatoes Rating': 28,
  'IMDB Rating': 5.4,
  'IMDB Votes': 8914},
 {'Title': 2012,
  'US Gross': 166112167,
  'Worldwide Gross': 766812167,
  'US DVD Sales': 50736023,
  'Production Budget': 200000000,
  'Release Date': 'Nov 13 2009',
  'MPAA Rating': 'PG-13',
  'Running Time min': 158,
  'Distributor': 'Sony Pictures',
  'Source': 'Original Screenplay',
  'Major Genre': 'Action',
  'Creative Type': 'Science Fiction',
  'Director': 'Roland Emmerich',
  'Rotten Tomatoes Rating': 39,
  'IMDB Rating': 6.2,
  'IMDB Votes': 396}]

If you run into a FileNotFoundError, it means your file path is not specified correctly and the program is unable to locate the file.
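
When this error appears, it helps to print where Python actually looked. The following is a minimal sketch of that idea. The helper name load_json is hypothetical (it is not part of the course code), and returning None on failure is just one possible design choice.

```python
import json
import os


def load_json(path):
    """Load a JSON file; print path diagnostics if the file is missing.

    Hypothetical helper for illustration only.
    """
    try:
        with open(path, encoding="utf8") as file:
            return json.load(file)
    except FileNotFoundError:
        # Show the full path Python tried, and the CWD it started from
        print("Could not find:", os.path.abspath(path))
        print("Current working directory:", os.getcwd())
        return None


movie_data = load_json("Datasets/filtered_movies.json")
```

Comparing the printed path with the real location of the file usually shows which part of the path is wrong.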

2.3.4 Practice Exercise

What if your file directory looks like this:

[Figure: file_directory]

You are now working in lecture2.ipynb. How should you specify the file path so that the data in the JSON file can be read properly into your program?

2.4 Specifying Your File Path

We will discuss how to specify file paths in Python for loading and saving data, with an emphasis on absolute and relative paths. File paths are crucial when working with files in Python, such as reading datasets and writing results to files.

2.4.1 Methods to Specify File Paths in Windows

Windows uses backslashes (\) to separate directories in file paths. However, in Python, the backslash is an escape character, so you need to either:

  • Escape backslashes by using a double backslash (\\), or
  • Use raw strings by prefixing the path with r.
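
To see why this matters, consider a path whose folder names happen to start with n and t. The sketch below uses a made-up path (new_folder and table.csv are hypothetical names) to show how an unescaped backslash silently changes the string.

```python
# In a normal string, "\n" is a newline and "\t" is a tab, not part of a folder name
broken = "C:\new_folder\table.csv"      # hypothetical path, written incorrectly
escaped = "C:\\new_folder\\table.csv"   # each backslash escaped
raw = r"C:\new_folder\table.csv"        # raw string: backslashes kept as typed

print("Newline inside broken path:", "\n" in broken)
print("Tab inside broken path:", "\t" in broken)
print("Escaped and raw are identical:", escaped == raw)
```

Python raises no error when the broken string is created. The problem only surfaces later, when the file cannot be found.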

2.4.1.1 Method 1: Using escaped backslashes

Code
file_path = "C:\\Users\\Username\\Documents\\data.csv"

2.4.1.2 Method 2: Using Raw Strings

Code
file_path = r"C:\Users\Username\Documents\data.csv"

2.4.1.3 Method 3: Using forward slashes (/)

Code
file_path = "C:/Users/Username/Documents/data.csv"
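
The first three methods can be compared directly. This sketch uses pathlib.PureWindowsPath, a standard-library class not otherwise covered in this lesson, because it applies Windows path rules on any operating system, so the comparison can be run on macOS or Linux as well.

```python
from pathlib import PureWindowsPath

escaped = "C:\\Users\\Username\\Documents\\data.csv"   # Method 1
raw = r"C:\Users\Username\Documents\data.csv"          # Method 2
forward = "C:/Users/Username/Documents/data.csv"       # Method 3

# Methods 1 and 2 produce exactly the same string
print(escaped == raw)

# Method 3 is a different string, but under Windows rules it names the same location
print(forward == raw)
print(PureWindowsPath(forward) == PureWindowsPath(raw))
```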

2.4.1.4 Method 4: Using os.path.join

Code
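import os
# Use "C:\\" (drive root), not "C:": on Windows, joining onto a bare "C:" gives the drive-relative path "C:Users\..."
file_path = os.path.join("C:\\", "Users", "Username", "Documents", "data.csv")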

This method works across different operating systems because os.path.join automatically uses the correct path separator (\ on Windows and / on Linux/macOS).
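
The separator behavior can be checked on any machine with ntpath and posixpath, the Windows and Linux/macOS implementations behind os.path. The sketch below is illustrative. It also shows one Windows detail worth knowing: a bare drive letter such as "C:" is not the root folder of the drive.

```python
import ntpath      # Windows version of os.path, importable on any OS
import posixpath   # Linux/macOS version of os.path

parts = ("Datasets", "filtered_movies.json")

windows_style = ntpath.join(*parts)
posix_style = posixpath.join(*parts)
print(windows_style)
print(posix_style)

# Drive-letter caveat: "C:" alone means "current folder on drive C", not the root
drive_relative = ntpath.join("C:", "Users")
drive_root = ntpath.join("C:\\", "Users")
print(drive_relative)
print(drive_root)
```

In everyday code you would call os.path.join and let Python pick the right implementation for the machine it runs on.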

macOS does not have this issue because, like Linux, it uses forward slashes (/) as the path separator, and a forward slash has no special meaning inside a Python string.

2.4.1.5 Best Practices for File Paths in Data Science

  • Use relative paths when working in a project structure, as it allows the project to be portable.
  • Use absolute paths when working with external or shared files that aren’t part of your project.
  • Check the current working directory to ensure you are referencing files correctly.
  • Avoid hardcoding file paths directly in the code to make your code reusable on different machines.
  • Use the forward slash (/) as the path separator, or os.path.join, if your code needs to work across different operating systems.
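
One way to follow these practices is to name the data folder once and build every file path from it. The sketch below assumes a Datasets folder next to your notebook. DATA_DIR and data_path are hypothetical names chosen for illustration.

```python
import os

# Assumed project layout: a "Datasets" folder next to the notebook
DATA_DIR = "Datasets"


def data_path(filename):
    """Return the relative path to a file inside DATA_DIR (hypothetical helper)."""
    return os.path.join(DATA_DIR, filename)


movies_file = data_path("filtered_movies.json")
print(movies_file)
print("Exists:", os.path.exists(movies_file))  # quick check before opening the file
```

If the data folder is ever moved or renamed, only the DATA_DIR line needs to change.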

2.5 Recognizing & Navigating to the Correct Working Directory in the Terminal (for Generating HTML from .ipynb Notebooks)

Before exporting a notebook to .html, you must make sure your terminal is in the folder that contains your .ipynb file.

2.5.1 Step 1: What is the Working Directory in the Terminal?

The working directory is where your terminal is currently located in your computer’s file system.

Run the following command in the terminal to print it out.

pwd

This tells you where your terminal is currently pointed.

2.5.2 Step 2: See What Files Are in the Current Working Directory

Run

ls

This will list the files/folders. Look for your .ipynb notebook.

If you see your notebook file here, you’re in the right directory!

If not, you need to navigate to the correct folder.

2.5.3 Step 3: Navigate to the Folder that Contains Your Notebook

Run

cd <folder-name>

  • Use cd .. to go up one folder

  • Use ls after cd to check that you’re in the right place
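
For comparison, the os module offers rough Python counterparts to these three terminal commands. The sketch below is only an analogy. It changes the working directory of the Python process (or notebook kernel), not of your terminal, and it moves back to the starting folder at the end.

```python
import os

start_dir = os.getcwd()            # similar to pwd
entries = os.listdir(start_dir)    # similar to ls
notebooks = [name for name in entries if name.endswith(".ipynb")]
print("Current folder:", start_dir)
print("Notebooks here:", notebooks)

os.chdir("..")                     # similar to cd ..
parent_dir = os.getcwd()
print("Parent folder:", parent_dir)

os.chdir(start_dir)                # go back to where we started
```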

2.5.4 Step 4: Generate the HTML file

Once you are in the correct folder (containing your .ipynb file), run:

quarto render analysis.ipynb --to html

This will generate analysis.html in the same folder. Replace analysis.ipynb with the name of your notebook.

2.5.5 🚨 Common Pitfall

Trying to export the notebook while the terminal is still in the wrong folder (like your home directory) will give you an error like this:

[NbConvertApp] WARNING | Notebook file 'analysis.ipynb' not found

✅ Fix: Use cd to go to the correct directory first!
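
The same check can be scripted. The sketch below is a hypothetical helper (build_render_command is not part of Quarto or the course code) that confirms the notebook is reachable from the current folder before assembling the render command shown in Step 4.

```python
import os


def build_render_command(notebook):
    """Return the quarto command for `notebook`, or raise if the file is not found.

    Hypothetical helper for illustration only.
    """
    if not os.path.isfile(notebook):
        raise FileNotFoundError(
            f"{notebook!r} was not found from {os.getcwd()!r}; "
            "cd to the folder that contains it first"
        )
    return ["quarto", "render", notebook, "--to", "html"]
```

With Quarto installed, the returned list could be passed to subprocess.run(..., check=True) to run the render from Python.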

2.5.6 📌 Summary Cheat Sheet

Command                              Purpose
pwd                                  Show your current directory
ls                                   List files in the current folder
cd folder/                           Enter a folder
cd ..                                Go up one level
quarto render file.ipynb --to html   Convert to HTML

2.6 Independent Study

To reinforce the skills learned in this lecture, complete the following tasks:

  1. Set Up Your Workspace
    • Create a folder named STAT201 to hold the course materials.
    • Create a Python environment for the upcoming coursework.
    • Organize your files by creating separate directories for datasets, assignments, quizzes, lectures, and exams.
    • Set up these directories in your file system to keep your work structured and easy to navigate.
  2. Create your first quiz notebook and generate HTML for your submission
    1. Create a new notebook named quiz2.ipynb inside the quizzes folder. Copy the header below and make sure to paste it into a raw cell.
    2. Download the filtered_movies.json file from the textbook dataset shared folder.
    3. Add a heading to the top of the notebook: ## Reading Data From a File
    4. In a code cell, print the message: I am reading data from a JSON file.
    5. Fix the file path below to make it work
    6. In another code cell, print the message: I successfully read the data into my python list.
    7. Save your file!
    8. Use Quarto to convert the notebook to an HTML file.
    9. Submit the HTML file to quiz2.
Code
# this is the header, copy the content between the """ """
"""
---
title: "Lecture 2: File path"
format: 
  html:
    toc: true
    toc-title: Contents
    code-fold: show
    self-contained: true
jupyter: python3
---
"""
Code
import json
with open("Datasets/filtered_movies.json", encoding="utf8") as file:
    movie_data=json.load(file)
movie_data[:2]
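
For task 1, the folders can be created by hand in your file explorer, or with a short script. The sketch below is one possible way, using the folder names from the task list. create_workspace is a hypothetical helper, and the lowercase folder names are an assumption, so match their spelling and capitalization to the paths you use in your code.

```python
import os


def create_workspace(root="STAT201"):
    """Create the course folder and its subfolders; return the paths created.

    Hypothetical helper; folder names follow the task list above.
    """
    subfolders = ["datasets", "assignments", "quizzes", "lectures", "exams"]
    created = []
    for name in subfolders:
        folder = os.path.join(root, name)
        os.makedirs(folder, exist_ok=True)  # no error if the folder already exists
        created.append(folder)
    return created
```

Calling create_workspace() from the folder where you keep your coursework would create STAT201 with the five subfolders inside it.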

By completing these exercises, you’ll gain practical experience with file paths, generating HTML from notebooks using Quarto, and interacting with the filesystem in Jupyter notebooks. This will prepare you for Python programming starting next week.