Programming with Python | Chapter 21: List Comprehensions and Generator Expressions

Chapter Objectives

  • Review the syntax and purpose of list comprehensions for creating lists concisely.
  • Review the syntax and purpose of generator expressions for creating generators lazily.
  • Incorporate conditional logic (if) within comprehensions and expressions to filter items.
  • Use conditional expressions (if/else) within the expression part of a comprehension to vary the output per item.
  • Create nested list comprehensions to work with nested iterables.
  • Apply similar comprehension syntax to create dictionaries (dict comprehensions) and sets (set comprehensions).
  • Understand the benefits (conciseness, readability for simple cases) and potential drawbacks (readability for complex cases) of comprehensions.

Introduction

We’ve previously encountered list comprehensions as a concise way to create lists and generator expressions (Chapter 15) as a memory-efficient way to produce sequences. These constructs are fundamental tools for writing expressive and efficient Python code, often replacing longer for loop structures. This chapter revisits list comprehensions and generator expressions, exploring more advanced techniques such as adding conditional filtering, using conditional logic within the output expression, nesting comprehensions, and applying the same concise syntax to create dictionaries and sets. Mastering these techniques can significantly shorten your code and often make its intent clearer, especially for common data transformation and filtering tasks.

Theory & Explanation

List Comprehensions Review

A list comprehension provides a compact syntax for creating a new list based on the values of an existing iterable.

Basic Syntax:

Python
new_list = [expression for item in iterable]

This is roughly equivalent to:

Python
new_list = []
for item in iterable:
    new_list.append(expression)

Example: Squaring numbers

Python
numbers = [1, 2, 3, 4, 5]

# Using a loop
squares_loop = []
for num in numbers:
    squares_loop.append(num * num)

# Using a list comprehension
squares_comp = [num * num for num in numbers]

print(f"Loop result:    {squares_loop}") # Output: [1, 4, 9, 16, 25]
print(f"Comp. result:   {squares_comp}") # Output: [1, 4, 9, 16, 25]

graph TD
    subgraph Standard Loop
        L1["Initialize empty list <i>results</i>"] --> L2{"Loop: for x in numbers"};
        L2 -- Yes --> L3["Calculate <i>square = x*x</i>"];
        L3 --> L4["Append <i>square</i> to <i>results</i>"];
        L4 --> L2;
        L2 -- No --> L5[End Loop];
    end

    subgraph List Comprehension ["x*x for x in numbers"]
        C1{"Iterate: for x in numbers"} -- For each x --> C2["Evaluate <i>x*x</i>"];
        C2 --> C3("Collect results into new list");
        C1 -- When done --> C4["Final List Ready"];
    end

    style L1 fill:#f9f,stroke:#333,stroke-width:2px
    style C3 fill:#ccf,stroke:#333,stroke-width:2px

Generator Expressions Review

Generator expressions have a similar syntax to list comprehensions but use parentheses () instead of square brackets []. They create a generator object, which produces items lazily (one at a time) when iterated over, making them memory-efficient.

Basic Syntax:

Python
generator_object = (expression for item in iterable)

Example: Squaring numbers (lazily)

Python
numbers = [1, 2, 3, 4, 5]

squares_gen = (num * num for num in numbers)

print(squares_gen) # Output: <generator object <genexpr> at 0x...>

# Iterate to get values
for square in squares_gen:
    print(square, end=" ") # Output: 1 4 9 16 25
print()
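
To make the laziness concrete, the following sketch compares the sizes reported by sys.getsizeof and then pulls values one at a time with next(). The variable names and the 100,000-element range are illustrative assumptions rather than part of the chapter's examples, and exact byte counts vary between Python versions, so only the comparison is printed.

```python
import sys

numbers = range(100_000)

squares_list = [n * n for n in numbers]  # every value is computed and stored now
squares_gen = (n * n for n in numbers)   # nothing is computed yet

# The generator object stays small no matter how long the source is.
# (getsizeof measures the container itself, not the integers it refers to.)
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(f"Generator smaller than list: {gen_size < list_size}")  # True

# Values are produced only when requested.
first = next(squares_gen)   # 0
second = next(squares_gen)  # 1
print(first, second)

# A generator expression can be handed straight to a consuming function;
# sum() continues from where next() left off.
remaining_total = sum(squares_gen)
print(remaining_total == sum(squares_list) - first - second)  # True
```

When a generator expression is the only argument to a function call, as in sum(n * n for n in numbers), the extra pair of parentheses can be omitted.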

Conditional Logic (if) for Filtering

You can add an if clause at the end of a comprehension or generator expression to filter items from the source iterable. The condition is checked first: only items for which it evaluates as true are passed to the expression, and the expression is never evaluated for the rest.

graph TD
    A{Iterate: for x in numbers} -- For each x --> B{"Condition: <i>if x % 2 == 0</i>?"};
    B -- Yes --> C["Evaluate <i>x*x</i>"];
    C --> D(Collect result into new list);
    B -- No --> A;  
    D --> A; 
    A -- When done --> E[Final Filtered List Ready];

    style B fill:#fcf,stroke:#333,stroke-width:1px
    style C fill:#ccf,stroke:#333,stroke-width:1px
    style D fill:#ccf,stroke:#333,stroke-width:2px

Syntax:

Python
filtered_list = [expression for item in iterable if condition]
filtered_gen = (expression for item in iterable if condition)

Example: Getting only even squares

Python
numbers = [1, 2, 3, 4, 5, 6]

# Using a loop
even_squares_loop = []
for num in numbers:
    if num % 2 == 0: # Condition check
        even_squares_loop.append(num * num)

# Using a list comprehension with 'if'
even_squares_comp = [num * num for num in numbers if num % 2 == 0]

# Using a generator expression with 'if'
even_squares_gen = (num * num for num in numbers if num % 2 == 0)

print(f"Loop result:  {even_squares_loop}") # Output: [4, 16, 36]
print(f"Comp. result: {even_squares_comp}") # Output: [4, 16, 36]
print(f"Gen. result:  {list(even_squares_gen)}") # Convert generator to list for printing: [4, 16, 36]
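
Filters are not limited to a single test. As a small sketch (the range and variable names are chosen here for illustration), a condition can be combined with and, or several if clauses can be written one after another; in both cases an item must pass every test to be kept.

```python
numbers = range(20)

# One if clause with a combined condition
div_by_6_and = [n for n in numbers if n % 2 == 0 and n % 3 == 0]

# Two consecutive if clauses: an item must satisfy both
div_by_6_chained = [n for n in numbers if n % 2 == 0 if n % 3 == 0]

print(div_by_6_and)      # [0, 6, 12, 18]
print(div_by_6_chained)  # [0, 6, 12, 18]
```

The and form is usually easier to read; the chained form is shown mainly so it is recognizable in other people's code.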

Conditional Expression (if/else) in the Output

You can use a conditional expression (ternary operator: value_if_true if condition else value_if_false) in the expression part (the beginning) of the comprehension to change the output based on a condition for each item. This is different from the filtering if clause.

graph TD
    X{Iterate: for x in numbers} -- For each x --> Y{"Condition: <i>if x % 2 == 0</i>?"};
    Y -- Yes --> Z1["Evaluate: <i><b>Even</b></i>"];
    Y -- No --> Z2["Evaluate: <i><b>Odd</b></i>"];
    Z1 --> W(Collect result into new list);
    Z2 --> W;
    W --> X; 
    X -- When done --> V[Final Transformed List Ready];

    style Y fill:#fcf,stroke:#333,stroke-width:1px
    style Z1 fill:#ccf,stroke:#333,stroke-width:1px
    style Z2 fill:#ccf,stroke:#333,stroke-width:1px
    style W fill:#ccf,stroke:#333,stroke-width:2px

Syntax:

Python
result_list = [value_if_true if condition else value_if_false for item in iterable]

Example: Label numbers as ‘Even’ or ‘Odd’

Python
numbers = [1, 2, 3, 4, 5]

# Using a loop
labels_loop = []
for num in numbers:
    if num % 2 == 0:
        labels_loop.append("Even")
    else:
        labels_loop.append("Odd")

# Using a list comprehension with conditional expression
labels_comp = ["Even" if num % 2 == 0 else "Odd" for num in numbers]

print(f"Loop result: {labels_loop}") # Output: ['Odd', 'Even', 'Odd', 'Even', 'Odd']
print(f"Comp. result: {labels_comp}") # Output: ['Odd', 'Even', 'Odd', 'Even', 'Odd']

Note: While you can combine both a conditional expression in the output and a filtering if clause, it can quickly become hard to read.
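
For reference, here is what the combination looks like in a deliberately small case. This is only a sketch with an arbitrary threshold (n > 2): the if at the end decides which items survive, and the if/else at the front decides what each surviving item becomes.

```python
numbers = [1, 2, 3, 4, 5]

# Filter first (keep n > 2), then label each remaining item
labels = ["Even" if n % 2 == 0 else "Odd" for n in numbers if n > 2]
print(labels)  # ['Odd', 'Even', 'Odd']

# The equivalent loop makes the two roles explicit
labels_loop = []
for n in numbers:
    if n > 2:  # filtering clause
        labels_loop.append("Even" if n % 2 == 0 else "Odd")  # conditional expression
print(labels_loop == labels)  # True
```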

graph TD
    subgraph "List Comprehension [...]"
        LC1[Start Processing Iterable] --> LC2[Process Item 1];
        LC2 --> LC3[Process Item 2];
        LC3 --> LC4[...]
        LC4 --> LC5[Process Last Item];
        LC5 --> LC6(Entire List Created in Memory);
    end

    subgraph "Generator Expression (...)"
        GE1[Define Generator] --> GE2{"Iteration Request (<br>e.g., <i>next()</i>, <i>for</i> loop)"};
        GE2 -- Request --> GE3[Process *one* item];
        GE3 --> GE4(Yield Item);
        GE4 --> GE2;
        GE2 -- StopIteration --> GE5[Generator Exhausted];
    end

    style LC6 fill:#ccf,stroke:#333,stroke-width:2px
    style GE4 fill:#cfc,stroke:#333,stroke-width:2px

Nested Comprehensions

List comprehensions can be nested to work with nested iterables, like lists of lists (matrices).

Syntax (Example for flattening a matrix):

Python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flattened = [element for row in matrix for element in row]
# Equivalent loop:
# flattened_loop = []
# for row in matrix:
#     for element in row:
#         flattened_loop.append(element)
print(flattened) # Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]

The for clauses are nested in the same order as they would appear in equivalent nested loops. Be cautious, as deeply nested comprehensions can become difficult to understand.
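
Two different shapes are easy to confuse here, so the following sketch (with a small made-up matrix) puts them side by side: several for clauses inside one comprehension produce a flat result, while a comprehension placed in the expression part produces a nested result.

```python
matrix = [[1, 2, 3], [4, 5, 6]]

# Multiple for clauses in ONE comprehension: the result is a flat list.
# Read left to right, exactly like the nested loops it replaces.
evens_flat = [value for row in matrix for value in row if value % 2 == 0]
print(evens_flat)  # [2, 4, 6]

# A comprehension INSIDE the expression part: the row structure is preserved.
doubled = [[value * 2 for value in row] for row in matrix]
print(doubled)  # [[2, 4, 6], [8, 10, 12]]
```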

Dictionary Comprehensions

Similar syntax using curly braces {} can create dictionaries. You need to provide both a key and a value expression, separated by a colon :.

Syntax:

Python
new_dict = {key_expression: value_expression for item in iterable if condition}

Example: Creating a dictionary of numbers and their squares

Python
numbers = [1, 2, 3, 4, 5]

# Using a loop
squares_dict_loop = {}
for num in numbers:
    squares_dict_loop[num] = num * num

# Using a dictionary comprehension
squares_dict_comp = {num: num * num for num in numbers}

print(f"Loop result: {squares_dict_loop}") # Output: {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
print(f"Comp. result: {squares_dict_comp}") # Output: {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

# Example with condition: Squares of even numbers only
even_squares_dict = {num: num * num for num in numbers if num % 2 == 0}
print(f"Even squares dict: {even_squares_dict}") # Output: {2: 4, 4: 16}
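
Dictionary comprehensions are also handy when the keys and values come from different places. The sketch below uses hypothetical names and scores to pair two lists with zip(), invert a dictionary with .items(), and show what happens when two items produce the same key (the later one overwrites the earlier one).

```python
names = ["Alice", "Bob", "Charlie"]
scores = [85, 55, 92]

# Pair two parallel lists
score_by_name = {name: score for name, score in zip(names, scores)}
print(score_by_name)  # {'Alice': 85, 'Bob': 55, 'Charlie': 92}

# Swap keys and values (safe here because the scores are unique)
name_by_score = {score: name for name, score in score_by_name.items()}
print(name_by_score)  # {85: 'Alice', 55: 'Bob', 92: 'Charlie'}

# Duplicate keys: 'banana' and 'cherry' both have length 6, the last one wins
word_by_length = {len(word): word for word in ["apple", "banana", "cherry"]}
print(word_by_length)  # {5: 'apple', 6: 'cherry'}
```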

Set Comprehensions

Set comprehensions also use curly braces {} but have only a single expression (like list comprehensions), with no key: value pair. Duplicate results are discarded automatically, and the resulting set is unordered.

Syntax:

Python
new_set = {expression for item in iterable if condition}

Example: Creating a set of unique squares of numbers (including duplicates in input)

Python
numbers = [1, 2, 2, 3, 3, 3, 4, 5]

# Using a loop
squares_set_loop = set()
for num in numbers:
    squares_set_loop.add(num * num)

# Using a set comprehension
squares_set_comp = {num * num for num in numbers}

print(f"Loop result: {squares_set_loop}") # Output: {1, 4, 9, 16, 25} (order may vary)
print(f"Comp. result: {squares_set_comp}") # Output: {1, 4, 9, 16, 25} (order may vary)

# Example with condition: Unique uppercase first letters
words = ["apple", "banana", "apricot", "blueberry", "cherry"]
first_letters_set = {word[0].upper() for word in words if len(word) > 0}
print(f"Unique first letters: {first_letters_set}") # Output: {'C', 'B', 'A'} (order may vary)
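
Because sets and dictionaries share curly braces, it is worth checking which one a given literal produces. A brief sketch (variable names are illustrative): empty braces create a dictionary, not a set, whereas a set comprehension over an empty iterable still produces a set.

```python
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# A set comprehension with nothing to iterate over is still a set
no_items = {n * n for n in []}
print(no_items)                  # set()
print(type(no_items).__name__)   # set
```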

Code Examples

Example 1: Filtering and Transforming Data

Python
# data_processing_comp.py

data = [
    {"name": "Alice", "score": 85},
    {"name": "Bob", "score": 55},
    {"name": "Charlie", "score": 92},
    {"name": "David", "score": 70},
]

# Get names of students who passed (score >= 60) using list comprehension
passing_names = [student["name"] for student in data if student["score"] >= 60]
print(f"Passing student names: {passing_names}")
# Output: ['Alice', 'Charlie', 'David']

# Create a dictionary mapping names to 'Pass'/'Fail' status
status_dict = {
    student["name"]: ("Pass" if student["score"] >= 60 else "Fail")
    for student in data
}
print(f"Student statuses: {status_dict}")
# Output: {'Alice': 'Pass', 'Bob': 'Fail', 'Charlie': 'Pass', 'David': 'Pass'}

# Create a set of unique scores using set comprehension
unique_scores = {student["score"] for student in data}
print(f"Unique scores: {unique_scores}")
# Output: {85, 55, 92, 70} (order may vary)

Explanation:

  • Demonstrates filtering (if student["score"] >= 60) in a list comprehension.
  • Shows using a conditional expression ("Pass" if ... else "Fail") in a dictionary comprehension.
  • Uses a set comprehension to easily extract unique values.

Example 2: Nested List Comprehension (Matrix Transpose)

Python
# matrix_transpose.py

matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# Transpose using nested loops
transposed_loop = []
num_rows = len(matrix)
num_cols = len(matrix[0]) # Assume matrix is not empty and rectangular
for j in range(num_cols):
    new_row = []
    for i in range(num_rows):
        new_row.append(matrix[i][j])
    transposed_loop.append(new_row)

# Transpose using nested list comprehension
transposed_comp = [[matrix[i][j] for i in range(num_rows)] for j in range(num_cols)]

print("Original Matrix:")
for row in matrix:
    print(row)

print("\nTransposed (Loop):")
for row in transposed_loop:
    print(row)
# Output:
# [1, 4, 7]
# [2, 5, 8]
# [3, 6, 9]

print("\nTransposed (Comprehension):")
for row in transposed_comp:
    print(row)
# Output:
# [1, 4, 7]
# [2, 5, 8]
# [3, 6, 9]

Explanation:

  • Transposing a matrix involves swapping rows and columns.
  • The nested list comprehension [[matrix[i][j] for i in range(num_rows)] for j in range(num_cols)] achieves this concisely.
  • The outer loop iterates through the columns (j), and the inner comprehension builds the new row by picking elements from each original row (i) at that column index j.
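
As a point of comparison, and not the approach used in this example, the same transpose is often written with the built-in zip(): unpacking the matrix with * passes each row as a separate argument, and zip() then groups the elements by position. A minimal sketch, assuming a rectangular matrix:

```python
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# zip(*matrix) yields one tuple per column; list() turns each into a row
transposed_zip = [list(column) for column in zip(*matrix)]
print(transposed_zip)  # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```

Note that zip() stops at the shortest row, so a ragged matrix would be silently truncated, whereas the index-based version would raise an IndexError.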

Example 3: Generator Expression for Large File Processing (Revisited)

Python
# large_file_genexp.py
import os

# Create a dummy large file (conceptual)
filename = "large_data.log"
try:
    with open(filename, "w") as f:
        for i in range(10000): # Simulate 10000 lines
            f.write(f"LINE {i}: Data value {i*i % 137}\n")
        f.write("ERROR: Critical failure detected on line 10001\n")
        for i in range(10000, 15000):
            f.write(f"LINE {i}: Data value {i*i % 101}\n")
except IOError:
    print("Error creating dummy file.")


# Process the file using a generator expression to find lines with 'LINE 12'
# without loading the whole file into memory.
try:
    with open(filename, "r") as f:
        # Generator expression - processes lazily
        matching_lines = (line.strip() for line in f if "LINE 12" in line)

        print(f"--- Lines containing 'LINE 12' (from {filename}) ---")
        count = 0
        for line in matching_lines: # Pulls lines from generator
            print(line)
            count += 1
            if count >= 5: # Limit output for demonstration
                print("... (limiting output)")
                break
        if count == 0:
            print("No matching lines found.")

except FileNotFoundError:
    print(f"Error: File '{filename}' not found.")
finally:
    if os.path.exists(filename):
        # os.remove(filename) # Uncomment to clean up
        pass

Explanation:

  • This example emphasizes the memory efficiency of generator expressions for tasks like searching large files.
  • The generator (line.strip() for line in f if "LINE 12" in line) only reads and processes lines as needed by the for loop, avoiding loading the entire (potentially huge) file content.
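
The "only as needed" behavior can be observed directly. The sketch below replaces the file with a short list of lines and wraps it in a hypothetical helper, tracked(), that records each line as it is pulled; neither the helper nor the sample lines come from the example above.

```python
lines_read = []  # records every line actually pulled from the source

def tracked(lines):
    """Hypothetical helper: yield each line, noting that it was read."""
    for line in lines:
        lines_read.append(line)
        yield line

# Stand-in for an open file: any iterable of lines behaves the same way
fake_file = [
    "LINE 1: ok\n",
    "LINE 12: first hit\n",
    "LINE 3: ok\n",
    "LINE 120: second hit\n",
]

matching = (line.strip() for line in tracked(fake_file) if "LINE 12" in line)
print(len(lines_read))  # 0 -> defining the generator reads nothing

first_hit = next(matching)
print(first_hit)        # LINE 12: first hit
print(len(lines_read))  # 2 -> reading stopped at the first match
```

The last line, "LINE 120", would also match on a later request, because the test is a plain substring check.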

Common Mistakes or Pitfalls

  • Readability: While powerful, overly complex or deeply nested comprehensions can become very difficult to read and understand compared to a well-structured for loop. Prioritize clarity.
  • Side Effects: List comprehensions are primarily for creating new lists based on existing iterables. Avoid including expressions with significant side effects (like printing or modifying external state) within a comprehension, as it can be confusing. Use regular for loops for tasks dominated by side effects.
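  • Generator Exhaustion: Generator expressions (like all generators) are exhausted after one full iteration. Iterating over the same generator a second time yields nothing, so build a list or recreate the generator if you need the values again.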
  • Syntax Errors: Mixing up brackets [] (list comp), parentheses () (gen exp), and curly braces {} (set/dict comp), or getting the order of expression, for, and if clauses wrong.
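
The generator exhaustion pitfall is the one most likely to fail silently, so here is a minimal sketch of it (variable names are illustrative). No error is raised on the second pass; the generator simply has nothing left to give.

```python
squares_gen = (n * n for n in [1, 2, 3])

first_pass = list(squares_gen)
second_pass = list(squares_gen)  # the generator is already exhausted

print(first_pass)   # [1, 4, 9]
print(second_pass)  # []

# If the values are needed more than once, keep them in a list instead
squares_list = [n * n for n in [1, 2, 3]]
print(sum(squares_list), max(squares_list))  # 14 9
```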

Chapter Summary

| Type | Basic Syntax | Output | Key Feature | Example (Conceptual) |
|---|---|---|---|---|
| List Comprehension | [expression for item in iterable] | List [] | Creates a new list in memory immediately (eager evaluation). | [x*x for x in range(5)] -> [0, 1, 4, 9, 16] |
| Generator Expression | (expression for item in iterable) | Generator object | Produces items one by one on demand (lazy evaluation), memory efficient for large sequences. | (x*x for x in range(5)) -> <generator object> |
| Dictionary Comprehension | {key_expr: val_expr for item in iterable} | Dictionary {} | Creates a new dictionary. Requires key and value expressions. | {x: x*x for x in range(3)} -> {0: 0, 1: 1, 2: 4} |
| Set Comprehension | {expression for item in iterable} | Set {} | Creates a new set, automatically handling uniqueness. | {x % 2 for x in [1, 2, 1, 3]} -> {0, 1} |
| Filtering (Applies to all) | [... for item in iterable if condition]; (... for item in iterable if condition); {... for item in iterable if condition} | Filtered collection | Includes only items from the iterable where the if condition is true. | [x for x in range(5) if x > 2] -> [3, 4] |
| Conditional Expression (Applies to all) | [val_true if cond else val_false for item in iterable]; (... for item in iterable); {... for item in iterable} | Transformed collection | Applies different expressions based on a condition for each item. Affects the output value. | ['even' if x%2==0 else 'odd' for x in range(3)] -> ['even', 'odd', 'even'] |
| Nested Comprehension (List example) | [elem for row in matrix for elem in row] | List (often flattened) | Processes nested iterables. Order matches nested loops. | [i for r in [[1],[2,3]] for i in r] -> [1, 2, 3] |

  • List comprehensions ([expr for item in iterable if cond]) provide a concise way to create lists.
  • Generator expressions ((expr for item in iterable if cond)) create generators, offering memory efficiency through lazy evaluation.
  • Conditional filtering (if cond) can be added to include only certain items from the iterable.
  • Conditional expressions (val_true if cond else val_false) can be used in the expr part to transform items differently.
  • Nested comprehensions allow processing of nested iterables, mirroring nested loops.
  • Dictionary comprehensions ({key_expr: val_expr for ...}) create dictionaries.
  • Set comprehensions ({expr for ...}) create sets, automatically handling uniqueness.
  • Comprehensions and expressions enhance code conciseness but should be used judiciously to maintain readability.

Exercises & Mini Projects

Exercises

  1. Squares of Evens: Use a list comprehension to create a list containing the squares of all even numbers from 0 to 20 (inclusive).
  2. Word Lengths: Given a sentence string (e.g., “This is a sample sentence”), use a dictionary comprehension to create a dictionary where keys are the words and values are the lengths of those words.
  3. Unique Vowels: Given a string, use a set comprehension to create a set containing all the unique vowels (a, e, i, o, u, case-insensitive) present in the string.
  4. Conditional Labeling: Use a list comprehension and a conditional expression to create a list from range(10) where numbers less than 5 are labeled “small” and numbers 5 or greater are labeled “large”.
  5. Flatten List: Given a list of lists nested = [[1, 2], [3, 4, 5], [6]], use a nested list comprehension to create a single flattened list [1, 2, 3, 4, 5, 6].

Mini Project: Refactoring with Comprehensions

Goal: Take a previous exercise or mini-project that used loops to create a list, dictionary, or set, and refactor it to use the corresponding comprehension or expression.

Choose one of the following (or similar code you’ve written):

  • Word Counter (Chapter 7): Refactor the loop that builds the word_counts dictionary to use a dictionary comprehension (this might be slightly tricky due to needing the counts; consider using collections.Counter first, then a comprehension if desired, or stick to the original loop if the comprehension becomes too complex).
  • Event Reminder (Chapter 20): Refactor the part where you process the events list. Can you use a list comprehension or generator expression to create tuples or objects containing the event name and the calculated days remaining?
  • Any Exercise Creating a List: Find an exercise from earlier chapters (e.g., calculating factorials into a list, filtering numbers) where you used a for loop and .append() to build a list. Rewrite that logic using a list comprehension.

Steps:

  1. Identify the code block with the for loop that populates a list, dictionary, or set.
  2. Analyze the logic: What is the source iterable? What transformation is applied to each item? Is there any filtering condition?
  3. Rewrite the logic using the appropriate comprehension syntax (list, dict, set, or generator expression).
  4. Ensure the refactored code produces the same result as the original loop-based code.
  5. Compare the readability of the two versions. Is the comprehension clearer in this case?
