Programming with Python | Chapter 10: File Handling – Reading and Writing Files

Chapter Objectives

  • Understand the basics of file systems and file paths (absolute vs. relative).
  • Learn how to open files using the built-in open() function.
  • Understand different file modes ('r', 'w', 'a', 'b', '+') for reading, writing, and appending.
  • Read data from files using methods like .read(), .readline(), and .readlines().
  • Write data to files using .write() and .writelines().
  • Understand the importance of closing files using .close() or preferably using the with statement for automatic closing.
  • Handle potential errors during file operations (e.g., FileNotFoundError).
  • Perform basic file path manipulations using the os module.

Introduction

Many applications need to interact with files stored on the computer’s file system – reading configuration data, processing text documents, saving user progress, logging events, and much more. This chapter introduces file handling in Python. You will learn how to open files in different modes (for reading, writing, or appending), how to read content from files line by line or all at once, how to write new data to files, and the crucial importance of properly closing files after use, preferably using the with statement.

Theory & Explanation

File Paths

To work with a file, Python needs to know its location, specified by a file path.

  • Absolute Path: Specifies the location from the root directory of the file system (e.g., C:\Users\Alice\Documents\report.txt on Windows, /home/alice/documents/report.txt on Linux/macOS). Absolute paths are unambiguous but less portable.
  • Relative Path: Specifies the location relative to the current working directory, which is the directory the Python process was started from and not necessarily the directory that contains the script (e.g., data/report.txt, ../images/logo.png). Relative paths are more portable because they don’t depend on a specific root structure.

Forward slashes / work in paths on every platform, including Windows, where the backslash \ is the native separator. If you do write backslashes in a string literal, use a raw string (e.g., r"C:\Users\name") or double each backslash (e.g., "C:\\Users\\name"), because a single backslash starts an escape sequence in an ordinary string.
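
As a small illustration of the points above, the following sketch compares the two ways of writing a Windows-style path and checks whether a path is absolute. The path data/report.txt is only a placeholder, and the file does not need to exist for these calls to work.

```python
import os

# A raw string and a string with doubled backslashes hold the same text.
raw_path = r"C:\Users\name"
escaped_path = "C:\\Users\\name"
print(raw_path == escaped_path)  # True

# Placeholder relative path, used only for illustration.
relative_path = "data/report.txt"
print(os.path.isabs(relative_path))  # False

# os.path.abspath resolves a relative path against the current working directory.
absolute_path = os.path.abspath(relative_path)
print(os.path.isabs(absolute_path))  # True
```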

Opening Files: open()

The core function for working with files is open(). It returns a file object (also called a file handle), which provides methods for interacting with the file.

Syntax:

Python
file_object = open(file_path, mode='r', encoding=None)
  • file_path: A string representing the path to the file.
  • mode: A string indicating how the file should be opened. Defaults to 'r' (read text).
  • encoding: The encoding used to decode/encode the file (e.g., 'utf-8', 'ascii'). It’s crucial for text files, especially when working across different systems. utf-8 is a very common and recommended encoding. If omitted, Python uses a platform-dependent default, which can lead to issues.
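
To see why the encoding argument matters, here is a minimal sketch that writes a word containing a non-ASCII character as UTF-8 and then reads it back twice, once with the matching encoding and once with a different one. The file name menu.txt and the temporary directory are assumptions made so the example is self-contained; the with statement it uses is covered later in this chapter.

```python
import os
import tempfile

text = "café"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "menu.txt")  # placeholder file name

    # Write the text as UTF-8 bytes.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Reading with the same encoding recovers the original text.
    with open(path, "r", encoding="utf-8") as f:
        same_encoding = f.read()

    # Reading with a different encoding raises no error here,
    # but the non-ASCII character comes back garbled.
    with open(path, "r", encoding="latin-1") as f:
        wrong_encoding = f.read()

print(same_encoding)   # café
print(wrong_encoding)  # cafÃ©
```

Passing the same explicit encoding on both the writing and the reading side avoids depending on the platform default.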

Common File Modes:

| Mode | Description | Behavior if File Exists | Behavior if File Doesn’t Exist |
|------|-------------|-------------------------|--------------------------------|
| 'r' | Read (text mode). Default. | Reads from start. Pointer at beginning. | Raises FileNotFoundError. |
| 'w' | Write (text mode). Truncates file first. | Overwrites (empties) the file. Pointer at beginning. | Creates new file. |
| 'a' | Append (text mode). Writes to the end. | Appends to end. Pointer at end. | Creates new file. |
| 'r+' | Read and Write (text mode). | Reads from start. Pointer at beginning. Can write anywhere. | Raises FileNotFoundError. |
| 'w+' | Write and Read (text mode). Truncates first. | Overwrites (empties) the file. Pointer at beginning. | Creates new file. |
| 'a+' | Append and Read (text mode). Writes to end. | Appends to end. Pointer initially at end for writing, but can be moved for reading. | Creates new file. |
| 'rb' | Read Binary mode. | Reads from start as bytes. Pointer at beginning. | Raises FileNotFoundError. |
| 'wb' | Write Binary mode. Truncates first. | Overwrites (empties) the file with bytes. Pointer at beginning. | Creates new file. |
| 'ab' | Append Binary mode. Writes bytes to end. | Appends bytes to end. Pointer at end. | Creates new file. |
| 'r+b' | Read and Write Binary mode. | Reads from start as bytes. Pointer at beginning. Can write bytes anywhere. | Raises FileNotFoundError. |
| 'w+b' | Write and Read Binary mode. Truncates first. | Overwrites (empties) the file with bytes. Pointer at beginning. | Creates new file. |
| 'a+b' | Append and Read Binary mode. Writes bytes to end. | Appends bytes to end. Pointer initially at end for writing, but can be moved for reading bytes. | Creates new file. |

Binary mode ('b') is used for non-text files like images, audio, executables, etc., where data is handled as raw bytes.

Figure: File modes 'r', 'w', and 'a'

  • Mode 'r' (Read): Pointer at start. Reads existing content. Error if file not found.
  • Mode 'w' (Write): Content cleared. Truncates the file if it exists. Creates the file if it does not exist. Pointer at start for writing.
  • Mode 'a' (Append): Pointer at end. Writes add to the end of the file. Creates the file if it does not exist.
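
The behavior of the three basic modes can be checked with a short experiment. This is a sketch under the assumption that a throwaway temporary directory is acceptable; notes.txt and missing.txt are placeholder names.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "notes.txt")  # placeholder file name

    # 'w' creates the file because it does not exist yet.
    with open(path, "w", encoding="utf-8") as f:
        f.write("first\n")

    # Opening with 'w' again truncates the file, so "first" is lost.
    with open(path, "w", encoding="utf-8") as f:
        f.write("second\n")

    # 'a' keeps the existing content and writes at the end.
    with open(path, "a", encoding="utf-8") as f:
        f.write("third\n")

    # Text mode returns str, binary mode returns bytes.
    with open(path, "r", encoding="utf-8") as f:
        text_content = f.read()
    with open(path, "rb") as f:
        raw_content = f.read()

    # 'r' on a file that does not exist raises FileNotFoundError.
    missing_path = os.path.join(tmp_dir, "missing.txt")
    try:
        open(missing_path, "r", encoding="utf-8")
        missing_error = None
    except FileNotFoundError as e:
        missing_error = type(e).__name__

print(repr(text_content))          # 'second\nthird\n'
print(type(raw_content).__name__)  # bytes
print(missing_error)               # FileNotFoundError
```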

Reading from Files

Once a file is opened in a read mode ('r', 'r+', 'rb', etc.), you can use these methods on the file object:

  • .read(size=-1): Reads and returns up to size bytes/characters from the file. If size is omitted or negative, reads and returns the entire file content as a single string (or bytes object in binary mode). Be cautious with large files, as this loads everything into memory.
  • .readline(): Reads and returns a single line from the file, including the newline character (\n) at the end. Returns an empty string ('') when the end of the file (EOF) is reached.
  • .readlines(): Reads all remaining lines from the file and returns them as a list of strings, where each string is a line including its newline character. Can consume a lot of memory for large files.
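
The three methods can be mixed on the same file object, because each call continues from the current position. The sketch below assumes a small three-line file created in a temporary directory; words.txt is a placeholder name.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "words.txt")  # placeholder file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("alpha\nbeta\ngamma\n")

    with open(path, "r", encoding="utf-8") as f:
        first_chars = f.read(3)      # at most 3 characters
        rest_of_line = f.readline()  # the rest of the current line
        remaining = f.readlines()    # every line that is left
        at_eof = f.readline()        # nothing left to read

print(repr(first_chars))   # 'alp'
print(repr(rest_of_line))  # 'ha\n'
print(remaining)           # ['beta\n', 'gamma\n']
print(repr(at_eof))        # ''
```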

Iterating over a file object is the most memory-efficient way to read a file line by line:

Python
# Assumes 'my_file.txt' exists in the current working directory
with open("my_file.txt", "r", encoding="utf-8") as f:
    for line in f:
        # Process each line (line includes the trailing '\n')
        print(line.strip())  # .strip() removes leading/trailing whitespace, including '\n'

Writing to Files

To write to a file, open it in a write or append mode ('w', 'a', 'w+', 'a+', 'wb', etc.).

  • .write(string): Writes the given string (or bytes in binary mode) to the file. It does not automatically add a newline character. Returns the number of characters/bytes written.
  • .writelines(list_of_strings): Writes each string from the list_of_strings to the file. It does not add newline characters between strings; you need to include them in the strings themselves if desired.
Python
# Example: writing lines with newlines explicitly added
lines_to_write = ["First line.\n", "Second line.\n", "Third line.\n"]

# "output.txt" is a placeholder name; 'w' mode creates or overwrites the file
with open("output.txt", "w", encoding="utf-8") as f:
    f.write("This is the first line manually written.\n")
    f.writelines(lines_to_write)
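
The following sketch shows the value returned by .write() and what happens when the strings passed to .writelines() carry no newline of their own. It assumes a temporary directory and a placeholder file name, out.txt.

```python
import os
import tempfile

items = ["x", "y"]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "out.txt")  # placeholder file name

    with open(path, "w", encoding="utf-8") as f:
        # .write() returns the number of characters written.
        count = f.write("Hello\n")

        # No separators are added: "a", "b" and "c\n" end up on one line.
        f.writelines(["a", "b", "c\n"])

        # .writelines() accepts any iterable of strings, so a generator
        # expression can append the newline to each item.
        f.writelines(item + "\n" for item in items)

    with open(path, "r", encoding="utf-8") as f:
        content = f.read()

print(count)          # 6
print(repr(content))  # 'Hello\nabc\nx\ny\n'
```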

Closing Files: .close() and the with Statement

When you are finished with a file, it’s crucial to close it using the .close() method on the file object. Closing releases the system resources associated with the file and ensures that all buffered data is written (flushed) to the disk.

Python
f = open("example.txt", "w", encoding="utf-8")
try:
    f.write("Important data.")
finally:
    f.close() # Ensure file is closed even if errors occur

Forgetting to close files can lead to data loss or corruption, resource leaks, or hitting system limits on open files.

The with Statement (Recommended):

Manually calling .close() can be error-prone, especially if exceptions occur before .close() is reached. The with statement provides a much safer and more convenient way to work with files (and other resources that need cleanup). It automatically closes the file when the with block is exited, even if errors occur.

Flowchart: file handling with the with statement

  • Start → open the file using with.
  • Success → perform read/write operations within the with block → exit the with block → file automatically closed → end (success).
  • Failure (e.g., FileNotFoundError) → handle the exception → end (error handled).

Syntax:

Python
with open(file_path, mode='r', encoding='utf-8') as file_variable:
    # Work with the file using 'file_variable' inside this indented block
    content = file_variable.read()
    # ... other operations ...
# File is automatically closed here, outside the 'with' block

This is the preferred way to handle files in Python.
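
To check the claim that the file is closed even when an error occurs, the sketch below raises an exception on purpose inside a with block and inspects the file object's closed attribute afterwards. The ValueError is simulated and example.txt is created in a temporary directory; both are assumptions made for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")

    try:
        with open(path, "w", encoding="utf-8") as f:
            f.write("Important data.")
            closed_inside = f.closed  # still open inside the block
            raise ValueError("simulated failure")
    except ValueError:
        pass

    # The with statement closed the file although the block raised.
    closed_after = f.closed

    # Closing also flushed the buffered text to disk.
    with open(path, "r", encoding="utf-8") as check:
        saved = check.read()

print(closed_inside)  # False
print(closed_after)   # True
print(saved)          # Important data.
```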

Working with File Paths (os module)

The os module (part of the standard library) provides functions for interacting with the operating system, including file path manipulation.

Python
import os

# Get current working directory
cwd = os.getcwd()
print(f"Current Directory: {cwd}")

# Join path components intelligently (handles / or \ correctly)
file_name = "report.txt"
data_dir = "data"
full_path = os.path.join(cwd, data_dir, file_name)
print(f"Full path: {full_path}")

# Check if a path exists
if os.path.exists(full_path):
    print("File exists.")
else:
    print("File does not exist.")

# Check if it's a file or directory
print(f"Is it a file? {os.path.isfile(full_path)}")
print(f"Is it a directory? {os.path.isdir(os.path.join(cwd, data_dir))}")

Diagram: os.path.join("/home/user", "data", "report.txt") combines its three input components into the output path "/home/user/data/report.txt" (on Linux/macOS).
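
Paths can also be taken apart again. As a sketch of a few more os.path helpers, the example below splits a joined path into its directory, file name, and extension; the directory and file names are placeholders and nothing is read from disk.

```python
import os

# Build a path with the separator of the current platform.
path = os.path.join("data", "reports", "report.txt")

folder = os.path.dirname(path)            # everything before the last component
name = os.path.basename(path)             # the last component
stem, extension = os.path.splitext(name)  # split off the extension

print(folder == os.path.join("data", "reports"))  # True
print(name)       # report.txt
print(stem)       # report
print(extension)  # .txt
```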

Code Examples

Example 1: Reading a File Line by Line using with

Python
# read_poem.py

file_path = "poem.txt"  # Created below in the current working directory

# Create a dummy poem.txt file for the example
try:
    with open(file_path, "w", encoding="utf-8") as f:
        f.write("Roses are red,\n")
        f.write("Violets are blue,\n")
        f.write("Python is fun,\n")
        f.write("And so are you!\n")
except IOError as e:
    print(f"Error creating dummy file: {e}")

print(f"--- Reading '{file_path}' line by line ---")
line_number = 0
try:
    # Use 'with' for safe file handling
    with open(file_path, "r", encoding="utf-8") as poem_file:
        for line in poem_file: # Efficiently iterates line by line
            line_number += 1
            # .strip() removes leading/trailing whitespace, including the newline
            print(f"Line {line_number}: {line.strip()}")

except FileNotFoundError:
    print(f"Error: The file '{file_path}' was not found.")
except IOError as e:
    print(f"Error reading file: {e}")

Explanation:

  • A dummy poem.txt is created first using with open(...) in write mode ('w').
  • The main reading part uses with open(...) in read mode ('r') with UTF-8 encoding.
  • The for line in poem_file: loop iterates through the file object directly, reading one line at a time efficiently.
  • .strip() is used to remove the trailing newline character (\n) from each line before printing.
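  • try...except blocks handle a potential FileNotFoundError if poem.txt doesn’t exist, or another IOError during reading. In Python 3, IOError is an alias of OSError, and FileNotFoundError is one of its subclasses, which is why the more specific handler is listed first. The with statement ensures the file is closed even if an error occurs within the block.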
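
As a possible variation on Example 1, the manual line_number counter can be replaced by the built-in enumerate(), which yields the count and the line together. This sketch uses a two-line poem in a temporary directory so that it runs on its own.

```python
import os
import tempfile

numbered = []

with tempfile.TemporaryDirectory() as tmp_dir:
    poem_path = os.path.join(tmp_dir, "poem.txt")
    with open(poem_path, "w", encoding="utf-8") as f:
        f.write("Roses are red,\nViolets are blue,\n")

    with open(poem_path, "r", encoding="utf-8") as poem_file:
        # start=1 makes the numbering begin at 1 instead of 0.
        for line_number, line in enumerate(poem_file, start=1):
            numbered.append(f"Line {line_number}: {line.strip()}")

for entry in numbered:
    print(entry)
# Line 1: Roses are red,
# Line 2: Violets are blue,
```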

Example 2: Writing to a File using with

Python
# write_data.py

output_file = "user_data.csv"
users = [
    {"id": 1, "name": "Alice", "email": "alice@example.com"},
    {"id": 2, "name": "Bob", "email": "bob@sample.org"},
    {"id": 3, "name": "Charlie", "email": "charlie@test.net"},
]

try:
    with open(output_file, "w", encoding="utf-8") as f:
        # Write header row
        f.write("UserID,UserName,UserEmail\n")

        # Write data rows
        for user in users:
            # Format data as a CSV line
            line = f"{user['id']},{user['name']},{user['email']}\n"
            f.write(line) # Write the formatted line

    print(f"Data successfully written to '{output_file}'")

except IOError as e:
    print(f"Error writing to file '{output_file}': {e}")

Explanation:

  • Opens user_data.csv in write mode ('w') using with. If the file exists, it will be overwritten.
  • Writes a header line first, making sure to include the newline \n.
  • Iterates through the users list (list of dictionaries).
  • Inside the loop, formats each user’s data into a comma-separated string with a newline character.
  • Uses f.write() to write each formatted line to the file.
  • A try...except block handles potential IOError during writing. The with statement ensures closure.
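
Reading such a file back follows the same pattern in reverse. The sketch below assumes the simple layout written by Example 2 (a header row and no commas inside any field) and rebuilds a list of dictionaries with .split(","). For data that may contain commas or quotes, the standard library's csv module is the safer choice.

```python
import os
import tempfile

rows = []

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "user_data.csv")

    # Recreate a small file in the same layout as Example 2.
    with open(csv_path, "w", encoding="utf-8") as f:
        f.write("UserID,UserName,UserEmail\n")
        f.write("1,Alice,alice@example.com\n")
        f.write("2,Bob,bob@sample.org\n")

    with open(csv_path, "r", encoding="utf-8") as f:
        # The first line holds the column names.
        header = f.readline().strip().split(",")
        # The remaining lines hold one record each.
        for line in f:
            fields = line.strip().split(",")
            rows.append(dict(zip(header, fields)))

print(header)               # ['UserID', 'UserName', 'UserEmail']
print(rows[0]["UserName"])  # Alice
print(len(rows))            # 2
```

Every value read this way is a string, so the IDs would need int() if they are to be used as numbers.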

Example 3: Appending to a File

Python
# append_log.py
import datetime
import os

log_file = "activity.log"

def log_event(message):
    """Appends a timestamped message to the log file."""
    now = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    log_entry = f"[{now}] {message}\n"
    try:
        # Use 'a' mode for appending
        with open(log_file, "a", encoding="utf-8") as f:
            f.write(log_entry)
    except IOError as e:
        print(f"Error writing to log file '{log_file}': {e}")

# --- Log some events ---
print(f"Logging events to '{os.path.abspath(log_file)}'") # Show full path
log_event("Application started.")
log_event("User 'admin' logged in.")
log_event("Data processing initiated.")
log_event("Application finished.")
print("Logging complete.")

Explanation:

  • The log_event function takes a message.
  • It gets the current timestamp.
  • It opens the log_file in append mode ('a') using with. If the file doesn’t exist, it will be created. If it does exist, new data will be added to the end.
  • It writes the formatted log entry (timestamp + message + newline) to the file.
  • The main part calls log_event multiple times, and each call adds a new line to the end of activity.log.

Common Mistakes or Pitfalls

  • FileNotFoundError: Trying to open a file for reading ('r') that doesn’t exist or providing an incorrect path.
  • Overwriting Files: Opening a file in write mode ('w') will erase its contents if it already exists. Be sure this is the intended behavior; use append mode ('a') if you want to add to the end.
  • Forgetting .close() (without with): Not closing a file can lead to data loss or resource issues. Always use the with statement.
  • Incorrect Mode: Trying to write to a file opened in read-only mode ('r') or read from a file opened in write-only mode ('w').
  • Missing Newlines: Forgetting to add \n when using .write() if you intend for subsequent writes to start on a new line. .writelines() also doesn’t add newlines automatically.
  • Encoding Issues: Reading or writing text files using the wrong encoding (or relying on the platform default) can lead to UnicodeDecodeError, UnicodeEncodeError, or Mojibake (garbled text). Specify encoding='utf-8' unless you know the file uses a different specific encoding.
  • Reading Large Files into Memory: Using .read() or .readlines() on very large files can exhaust system memory. Iterate over the file object (for line in file:) for line-by-line processing of large files.
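
The "Incorrect Mode" pitfall can be reproduced in a few lines. In this sketch, reading from a file opened with 'w' and writing to a file opened with 'r' both raise io.UnsupportedOperation, which is caught so the script can continue. The temporary directory and the name notes.txt are assumptions for the demonstration.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "notes.txt")  # placeholder file name

    # A file opened with 'w' is write-only.
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello\n")
        try:
            f.read()
            read_error = None
        except io.UnsupportedOperation as e:
            read_error = str(e)

    # A file opened with 'r' is read-only.
    with open(path, "r", encoding="utf-8") as f:
        try:
            f.write("more\n")
            write_error = None
        except io.UnsupportedOperation as e:
            write_error = str(e)

print(f"Reading in 'w' mode failed: {read_error}")
print(f"Writing in 'r' mode failed: {write_error}")
```

If both operations are needed on the same file object, one of the '+' modes from the table of file modes is the usual answer.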

Chapter Summary

| Concept | Syntax / Example | Description |
|---------|------------------|-------------|
| File Paths | "data/report.txt" (Relative); r"C:\Users\doc.txt" (Absolute) | Specifies file location. Relative paths are more portable, absolute paths are unambiguous. Use os.path.join() for cross-platform compatibility. |
| Opening File | f = open("file.txt", "r", encoding="utf-8") | Returns a file object. Takes a path, an optional mode (default 'r'), and an optional encoding (utf-8 recommended for text). |
| with Statement | with open(...) as f: followed by an indented block that works with f | Recommended way to open files. Automatically closes the file when the block is exited, even if errors occur. Ensures resource cleanup. |
| File Modes (Text) | 'r', 'w', 'a', 'r+', 'w+', 'a+' | r: Read (default). Error if the file does not exist. w: Write (truncates). Creates the file if it does not exist. a: Append. Creates the file if it does not exist. +: Adds read/write capability to the primary mode. |
| File Modes (Binary) | 'rb', 'wb', 'ab', etc. | Append b to the mode for handling non-text files (images, executables) as raw bytes. |
| Reading (All) | content = f.read() | Reads the entire file content into a single string (or bytes). Caution with large files (memory usage). |
| Reading (Line) | line = f.readline() | Reads a single line, including the trailing newline \n. Returns an empty string at EOF. |
| Reading (Lines) | lines = f.readlines() | Reads all remaining lines into a list of strings. Caution with large files (memory usage). |
| Reading (Iterating) | for line in f: print(line.strip()) | Most memory-efficient way to read line by line. Iterates directly over the file object. Use .strip() to remove newline characters. |
| Writing (String) | f.write("Hello\n") | Writes a string (or bytes) to the file. Does not automatically add newlines. Returns the number of characters/bytes written. |
| Writing (Lines) | f.writelines(["Line1\n", "Line2\n"]) | Writes each string from a list/iterable to the file. Does not add newlines between items. |
| Closing File | f.close() | Releases system resources and flushes buffers. Crucial if not using with. Using with is preferred as it handles closing automatically. |
| Error Handling | try...except FileNotFoundError: followed by an indented block that handles the error | Use try...except blocks to catch errors like FileNotFoundError or IOError during file operations. |
| os Module | os.path.join('dir', 'file'); os.path.exists(path); os.getcwd() | Provides OS-level functions, including portable path joining, checking existence, getting the current directory, etc. |

  • Files are accessed using their paths (absolute or relative).
  • Use open(path, mode, encoding) to get a file object. Specify the correct mode ('r', 'w', 'a', 'b', '+') and encoding (e.g., 'utf-8').
  • Read using .read(), .readline(), .readlines(), or by iterating directly over the file object (preferred for line-by-line).
  • Write using .write(string) or .writelines(list_of_strings). Remember to add \n manually if needed.
  • Crucially, close files after use. The with open(...) as ...: statement is the recommended way as it guarantees closure even if errors occur.
  • Handle potential errors like FileNotFoundError and IOError using try...except blocks.
  • The os module (e.g., os.path.join, os.path.exists) helps work with file paths portably.

Exercises & Mini Projects

Exercises

  1. Read and Print: Create a text file named info.txt with a few lines about Python. Write a Python script that opens this file, reads its entire content using .read(), and prints the content to the console. Use the with statement.
  2. Line-by-Line Reading: Using the info.txt file from Exercise 1, write a script that reads the file line by line using a for loop and prints each line prefixed with its line number (e.g., “1: First line text”).
  3. Write to File: Write a script that asks the user for their name using input() and then writes “Hello, [User’s Name]!” to a file named greeting.txt. Use the with statement and write mode ('w').
  4. Append to File: Write a script that opens the greeting.txt file created in Exercise 3 in append mode ('a'). Write a new line “Glad to meet you!” to the end of the file. Verify the file contents afterwards.
  5. File Exists Check: Write a script that checks if the file info.txt exists using os.path.exists(). Print an appropriate message whether it exists or not.

Mini Project: Simple File Copier

Goal: Create a script that copies the content of one text file to another.

Steps:

  1. Create a source text file named source.txt and put some lines of text in it.
  2. Write a Python script file_copier.py.
  3. Inside the script:
    • Define variables for the source file name (source_file = "source.txt") and the destination file name (dest_file = "destination.txt").
    • Use nested with statements (or two separate with blocks, though nested is common here):
      • Open the source_file in read mode ('r').
      • Open the dest_file in write mode ('w').
    • Inside the with blocks:
      • Read the entire content of the source file using .read().
      • Write the retrieved content to the destination file using .write().
    • Add try...except blocks around the file operations to handle potential FileNotFoundError (if source.txt doesn’t exist) or IOError. Print informative error messages if exceptions occur.
    • If the copy is successful (no exceptions), print a success message like “Successfully copied ‘source.txt’ to ‘destination.txt’”.

(Optional Enhancement): Instead of reading the whole file at once with .read(), modify the script to read the source file line by line using a for loop and write each line to the destination file. This is more memory-efficient for large files.
