Programming with Python | Chapter 21: List Comprehensions and Generator Expressions
Chapter Objectives
- Review the syntax and purpose of list comprehensions for creating lists concisely.
- Review the syntax and purpose of generator expressions for creating generators lazily.
- Incorporate conditional logic (if) within comprehensions and expressions to filter items.
- Include conditional expressions (if/else) within the expression part of a comprehension.
- Create nested list comprehensions to work with nested iterables.
- Apply similar comprehension syntax to create dictionaries (dict comprehensions) and sets (set comprehensions).
- Understand the benefits (conciseness, readability for simple cases) and potential drawbacks (readability for complex cases) of comprehensions.
Introduction
We’ve previously encountered list comprehensions as a concise way to create lists and generator expressions (Chapter 15) as a memory-efficient way to produce sequences. These constructs are fundamental tools for writing expressive and efficient Python code, often replacing longer for loop structures. This chapter revisits list comprehensions and generator expressions, exploring more advanced techniques such as adding conditional filtering, using conditional logic within the output expression, nesting comprehensions, and applying the same concise syntax to create dictionaries and sets. Mastering these techniques can significantly shorten your code and often make its intent clearer, especially for common data transformation and filtering tasks.
Theory & Explanation
List Comprehensions Review
A list comprehension provides a compact syntax for creating a new list based on the values of an existing iterable.
Basic Syntax:
new_list = [expression for item in iterable]
This is roughly equivalent to:
new_list = []
for item in iterable:
    new_list.append(expression)
Example: Squaring numbers
numbers = [1, 2, 3, 4, 5]
# Using a loop
squares_loop = []
for num in numbers:
    squares_loop.append(num * num)
# Using a list comprehension
squares_comp = [num * num for num in numbers]
print(f"Loop result: {squares_loop}") # Output: [1, 4, 9, 16, 25]
print(f"Comp. result: {squares_comp}") # Output: [1, 4, 9, 16, 25]
graph TD
    subgraph Standard Loop
        L1["Initialize empty list <i>results</i>"] --> L2{"Loop: for x in numbers"};
        L2 -- Yes --> L3["Calculate <i>square = x*x</i>"];
        L3 --> L4["Append <i>square</i> to <i>results</i>"];
        L4 --> L2;
        L2 -- No --> L5[End Loop];
    end
    subgraph List Comprehension ["x*x for x in numbers"]
        C1{"Iterate: for x in numbers"} -- For each x --> C2["Evaluate <i>x*x</i>"];
        C2 --> C3("Collect results into new list");
        C1 -- When done --> C4["Final List Ready"];
    end
    style L1 fill:#f9f,stroke:#333,stroke-width:2px
    style C3 fill:#ccf,stroke:#333,stroke-width:2px
Generator Expressions Review
Generator expressions have a similar syntax to list comprehensions but use parentheses () instead of square brackets []. They create a generator object, which produces items lazily (one at a time) when iterated over, making them memory-efficient.
Basic Syntax:
generator_object = (expression for item in iterable)
Example: Squaring numbers (lazily)
numbers = [1, 2, 3, 4, 5]
squares_gen = (num * num for num in numbers)
print(squares_gen) # Output: <generator object <genexpr> at 0x...>
# Iterate to get values
for square in squares_gen:
    print(square, end=" ") # Output: 1 4 9 16 25
print()
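To make "lazily" concrete, here is a minimal sketch of pulling values on demand with next() and of passing a generator expression directly to sum(). The variable names are illustrative and not part of the chapter's examples.

```python
numbers = [1, 2, 3, 4, 5]
squares_gen = (num * num for num in numbers)

# next() pulls one value at a time; nothing is computed before it is requested.
first = next(squares_gen)   # 1
second = next(squares_gen)  # 4

# sum() consumes the remaining values (9, 16, 25) without building a list.
remaining_total = sum(squares_gen)  # 50

# When a generator expression is the only argument of a call,
# the extra pair of parentheses can be omitted.
total = sum(num * num for num in numbers)  # 55

print(first, second, remaining_total, total)  # Output: 1 4 50 55
```

Because the generator remembers its position, sum() only sees the values that next() has not already taken.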
Conditional Logic (if) for Filtering
You can add an if clause at the end of a comprehension or expression to filter items from the source iterable before the expression is applied. Only items for which the condition is True are included.
graph TD
    A{Iterate: for x in numbers} -- For each x --> B{"Condition: <i>if x % 2 == 0</i>?"};
    B -- Yes --> C["Evaluate <i>x*x</i>"];
    C --> D(Collect result into new list);
    B -- No --> A;
    D --> A;
    A -- When done --> E[Final Filtered List Ready];
    style B fill:#fcf,stroke:#333,stroke-width:1px
    style C fill:#ccf,stroke:#333,stroke-width:1px
    style D fill:#ccf,stroke:#333,stroke-width:2px
Syntax:
filtered_list = [expression for item in iterable if condition]
filtered_gen = (expression for item in iterable if condition)
Example: Getting only even squares
numbers = [1, 2, 3, 4, 5, 6]
# Using a loop
even_squares_loop = []
for num in numbers:
    if num % 2 == 0: # Condition check
        even_squares_loop.append(num * num)
# Using a list comprehension with 'if'
even_squares_comp = [num * num for num in numbers if num % 2 == 0]
# Using a generator expression with 'if'
even_squares_gen = (num * num for num in numbers if num % 2 == 0)
print(f"Loop result: {even_squares_loop}") # Output: [4, 16, 36]
print(f"Comp. result: {even_squares_comp}") # Output: [4, 16, 36]
print(f"Gen. result: {list(even_squares_gen)}") # Convert generator to list for printing: [4, 16, 36]
Conditional Expression (if/else) in the Output
You can use a conditional expression (ternary operator: value_if_true if condition else value_if_false) in the expression part (the beginning) of the comprehension to change the output based on a condition for each item. This is different from the filtering if clause.
graph TD
    X{Iterate: for x in numbers} -- For each x --> Y{"Condition: <i>if x % 2 == 0</i>?"};
    Y -- Yes --> Z1["Evaluate: <i><b>Even</b></i>"];
    Y -- No --> Z2["Evaluate: <i><b>Odd</b></i>"];
    Z1 --> W(Collect result into new list);
    Z2 --> W;
    W --> X;
    X -- When done --> V[Final Transformed List Ready];
    style Y fill:#fcf,stroke:#333,stroke-width:1px
    style Z1 fill:#ccf,stroke:#333,stroke-width:1px
    style Z2 fill:#ccf,stroke:#333,stroke-width:1px
    style W fill:#ccf,stroke:#333,stroke-width:2px
Syntax:
result_list = [value_if_true if condition else value_if_false for item in iterable]
Example: Label numbers as ‘Even’ or ‘Odd’
numbers = [1, 2, 3, 4, 5]
# Using a loop
labels_loop = []
for num in numbers:
    if num % 2 == 0:
        labels_loop.append("Even")
    else:
        labels_loop.append("Odd")
# Using a list comprehension with conditional expression
labels_comp = ["Even" if num % 2 == 0 else "Odd" for num in numbers]
print(f"Loop result: {labels_loop}") # Output: ['Odd', 'Even', 'Odd', 'Even', 'Odd']
print(f"Comp. result: {labels_comp}") # Output: ['Odd', 'Even', 'Odd', 'Even', 'Odd']
Note: While you can combine both a conditional expression in the output and a filtering if clause, it can quickly become hard to read.
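As an illustration of that combination, the sketch below filters with a trailing if and labels with a conditional expression in the same comprehension. The threshold of 3 is an arbitrary choice made for this example.

```python
numbers = [1, 2, 3, 4, 5, 6, 7, 8]

# The trailing 'if' filters first (keep numbers greater than 3);
# the conditional expression at the front then labels each surviving item.
labels = ["Even" if num % 2 == 0 else "Odd" for num in numbers if num > 3]
print(labels)  # Output: ['Even', 'Odd', 'Even', 'Odd', 'Even']

# The equivalent loop makes the two roles easier to tell apart.
labels_loop = []
for num in numbers:
    if num > 3:  # filtering
        labels_loop.append("Even" if num % 2 == 0 else "Odd")  # transforming
print(labels_loop == labels)  # Output: True
```

If a line like this needs a comment to explain which if does what, the loop version is probably the clearer choice.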
graph TD
    subgraph "List Comprehension [...]"
        LC1[Start Processing Iterable] --> LC2[Process Item 1];
        LC2 --> LC3[Process Item 2];
        LC3 --> LC4[...];
        LC4 --> LC5[Process Last Item];
        LC5 --> LC6(Entire List Created in Memory);
    end
    subgraph "Generator Expression (...)"
        GE1[Define Generator] --> GE2{"Iteration Request (<br>e.g., <i>next()</i>, <i>for</i> loop)"};
        GE2 -- Request --> GE3[Process *one* item];
        GE3 --> GE4(Yield Item);
        GE4 --> GE2;
        GE2 -- StopIteration --> GE5[Generator Exhausted];
    end
    style LC6 fill:#ccf,stroke:#333,stroke-width:2px
    style GE4 fill:#cfc,stroke:#333,stroke-width:2px
Nested Comprehensions
List comprehensions can be nested to work with nested iterables, like lists of lists (matrices).
Syntax (Example for flattening a matrix):
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flattened = [element for row in matrix for element in row]
# Equivalent loop:
# flattened_loop = []
# for row in matrix:
#     for element in row:
#         flattened_loop.append(element)
print(flattened) # Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
The for clauses are nested in the same order as they would appear in equivalent nested loops. Be cautious, as deeply nested comprehensions can become difficult to understand.
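The clause order can be seen in a short sketch that reuses the same 3x3 matrix. It contrasts two forms: chained for clauses, which flatten, and a comprehension nested inside the expression part, which keeps the two-dimensional shape. The variable names are illustrative.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Chained clauses read left to right like nested loops read top to bottom:
# for row in matrix -> for element in row -> if element is even.
even_elements = [element for row in matrix for element in row if element % 2 == 0]
print(even_elements)  # Output: [2, 4, 6, 8]

# A comprehension inside the expression part builds one inner list per row,
# so the result is still a list of lists.
doubled = [[element * 2 for element in row] for row in matrix]
print(doubled)  # Output: [[2, 4, 6], [8, 10, 12], [14, 16, 18]]
```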
Dictionary Comprehensions
Similar syntax using curly braces {} can create dictionaries. You need to provide both a key and a value expression, separated by a colon (:).
Syntax:
new_dict = {key_expression: value_expression for item in iterable if condition}
Example: Creating a dictionary of numbers and their squares
numbers = [1, 2, 3, 4, 5]
# Using a loop
squares_dict_loop = {}
for num in numbers:
    squares_dict_loop[num] = num * num
# Using a dictionary comprehension
squares_dict_comp = {num: num * num for num in numbers}
print(f"Loop result: {squares_dict_loop}") # Output: {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
print(f"Comp. result: {squares_dict_comp}") # Output: {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
# Example with condition: Squares of even numbers only
even_squares_dict = {num: num * num for num in numbers if num % 2 == 0}
print(f"Even squares dict: {even_squares_dict}") # Output: {2: 4, 4: 16}
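Dictionary comprehensions are also handy when the keys and values come from different places. The sketch below pairs two parallel lists with zip() and then inverts the resulting mapping. The sample names and scores are made up for illustration, and the inversion assumes the values are unique and hashable.

```python
names = ["Alice", "Bob", "Charlie"]
scores = [85, 55, 92]

# Unpack each (name, score) pair produced by zip() into key and value.
score_by_name = {name: score for name, score in zip(names, scores)}
print(score_by_name)  # Output: {'Alice': 85, 'Bob': 55, 'Charlie': 92}

# Swap keys and values by iterating over .items().
# Assumption: scores are unique; duplicate scores would overwrite each other.
name_by_score = {score: name for name, score in score_by_name.items()}
print(name_by_score)  # Output: {85: 'Alice', 55: 'Bob', 92: 'Charlie'}
```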
Set Comprehensions
Set comprehensions also use curly braces {} but only have a single expression (like list comprehensions). They automatically handle uniqueness.
Syntax:
new_set = {expression for item in iterable if condition}
Example: Creating a set of unique squares of numbers (including duplicates in input)
numbers = [1, 2, 2, 3, 3, 3, 4, 5]
# Using a loop
squares_set_loop = set()
for num in numbers:
    squares_set_loop.add(num * num)
# Using a set comprehension
squares_set_comp = {num * num for num in numbers}
print(f"Loop result: {squares_set_loop}") # Output: {1, 4, 9, 16, 25} (order may vary)
print(f"Comp. result: {squares_set_comp}") # Output: {1, 4, 9, 16, 25} (order may vary)
# Example with condition: Unique uppercase first letters
words = ["apple", "banana", "apricot", "blueberry", "cherry"]
first_letters_set = {word[0].upper() for word in words if len(word) > 0}
print(f"Unique first letters: {first_letters_set}") # Output: {'C', 'B', 'A'} (order may vary)
Code Examples
Example 1: Filtering and Transforming Data
# data_processing_comp.py
data = [
    {"name": "Alice", "score": 85},
    {"name": "Bob", "score": 55},
    {"name": "Charlie", "score": 92},
    {"name": "David", "score": 70},
]
# Get names of students who passed (score >= 60) using list comprehension
passing_names = [student["name"] for student in data if student["score"] >= 60]
print(f"Passing student names: {passing_names}")
# Output: ['Alice', 'Charlie', 'David']
# Create a dictionary mapping names to 'Pass'/'Fail' status
status_dict = {
    student["name"]: ("Pass" if student["score"] >= 60 else "Fail")
    for student in data
}
print(f"Student statuses: {status_dict}")
# Output: {'Alice': 'Pass', 'Bob': 'Fail', 'Charlie': 'Pass', 'David': 'Pass'}
# Create a set of unique scores using set comprehension
unique_scores = {student["score"] for student in data}
print(f"Unique scores: {unique_scores}")
# Output: {85, 55, 92, 70} (order may vary)
Explanation:
- Demonstrates filtering (if student["score"] >= 60) in a list comprehension.
- Shows using a conditional expression ("Pass" if ... else "Fail") in a dictionary comprehension.
- Uses a set comprehension to easily extract unique values.
Example 2: Nested List Comprehension (Matrix Transpose)
# matrix_transpose.py
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
# Transpose using nested loops
transposed_loop = []
num_rows = len(matrix)
num_cols = len(matrix[0]) # Assume matrix is not empty and rectangular
for j in range(num_cols):
    new_row = []
    for i in range(num_rows):
        new_row.append(matrix[i][j])
    transposed_loop.append(new_row)
# Transpose using nested list comprehension
transposed_comp = [[matrix[i][j] for i in range(num_rows)] for j in range(num_cols)]
print("Original Matrix:")
for row in matrix: print(row)
print("\nTransposed (Loop):")
for row in transposed_loop: print(row)
# Output:
# [1, 4, 7]
# [2, 5, 8]
# [3, 6, 9]
print("\nTransposed (Comprehension):")
for row in transposed_comp: print(row)
# Output:
# [1, 4, 7]
# [2, 5, 8]
# [3, 6, 9]
Explanation:
- Transposing a matrix involves swapping rows and columns.
- The nested list comprehension [[matrix[i][j] for i in range(num_rows)] for j in range(num_cols)] achieves this concisely.
- The outer loop iterates through the columns (j), and the inner comprehension builds the new row by picking elements from each original row (i) at that column index j.
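As an alternative that this chapter's example does not use, the built-in zip() can do the column grouping without index arithmetic. This is a sketch under the same assumption as above (a non-empty, rectangular matrix); with ragged rows, zip() silently stops at the shortest row.

```python
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# zip(*matrix) unpacks the rows as separate arguments and yields one tuple
# per column; the comprehension converts each tuple back into a list.
transposed_zip = [list(column) for column in zip(*matrix)]

for row in transposed_zip:
    print(row)
# Output:
# [1, 4, 7]
# [2, 5, 8]
# [3, 6, 9]
```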
Example 3: Generator Expression for Large File Processing (Revisited)
# large_file_genexp.py
import os
# Create a dummy large file (conceptual)
filename = "large_data.log"
try:
    with open(filename, "w") as f:
        for i in range(10000): # Simulate 10000 lines
            f.write(f"LINE {i}: Data value {i*i % 137}\n")
        f.write("ERROR: Critical failure detected on line 10001\n")
        for i in range(10000, 15000):
            f.write(f"LINE {i}: Data value {i*i % 101}\n")
except IOError:
    print("Error creating dummy file.")
# Process the file using a generator expression to find lines with 'LINE 12'
# without loading the whole file into memory.
try:
    with open(filename, "r") as f:
        # Generator expression - processes lazily
        matching_lines = (line.strip() for line in f if "LINE 12" in line)
        print(f"--- Lines containing 'LINE 12' (from {filename}) ---")
        count = 0
        for line in matching_lines: # Pulls lines from generator
            print(line)
            count += 1
            if count >= 5: # Limit output for demonstration
                print("... (limiting output)")
                break
        if count == 0:
            print("No matching lines found.")
except FileNotFoundError:
    print(f"Error: File '{filename}' not found.")
finally:
    if os.path.exists(filename):
        # os.remove(filename) # Uncomment to clean up
        pass
Explanation:
- This example emphasizes the memory efficiency of generator expressions for tasks like searching large files.
- The generator (line.strip() for line in f if "LINE 12" in line) only reads and processes lines as needed by the for loop, avoiding loading the entire (potentially huge) file content.
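The manual counter in Example 3 can also be expressed with itertools.islice from the standard library, which stops pulling from the generator after a fixed number of items. The sketch below is not the chapter's code: it uses an in-memory io.StringIO object as a stand-in for the log file so that it runs without touching the disk, and its 200 simplified lines are made up for illustration.

```python
import io
from itertools import islice

# Hypothetical stand-in for a large log file: 200 lines held in memory.
fake_file = io.StringIO(
    "".join(f"LINE {i}: Data value {i}\n" for i in range(200))
)

# Same lazy filter as in Example 3.
matching_lines = (line.strip() for line in fake_file if "LINE 12" in line)

# islice() requests at most five items, then stops asking for more.
first_five = list(islice(matching_lines, 5))
for line in first_five:
    print(line)
# Output:
# LINE 12: Data value 12
# LINE 120: Data value 120
# LINE 121: Data value 121
# LINE 122: Data value 122
# LINE 123: Data value 123
```

Because the generator is only asked for five matches, lines after the fifth match are never filtered or stripped.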
Common Mistakes or Pitfalls
- Readability: While powerful, overly complex or deeply nested comprehensions can become very difficult to read and understand compared to a well-structured for loop. Prioritize clarity.
- Side Effects: List comprehensions are primarily for creating new lists based on existing iterables. Avoid including expressions with significant side effects (like printing or modifying external state) within a comprehension, as it can be confusing. Use regular for loops for tasks dominated by side effects.
- Generator Exhaustion: Forgetting that generator expressions (like all generators) are exhausted after one full iteration. Iterating over the same generator a second time produces no items.
- Syntax Errors: Mixing up brackets [] (list comp), parentheses () (gen exp), and curly braces {} (set/dict comp), or getting the order of the expression, for, and if clauses wrong.
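The generator exhaustion pitfall is easy to reproduce. In this minimal sketch (variable names are illustrative), the second pass over the same generator comes back empty, and building a list is shown as one way to keep results that are needed more than once.

```python
numbers = [1, 2, 3]
squares_gen = (n * n for n in numbers)

first_pass = list(squares_gen)   # Consumes every item
second_pass = list(squares_gen)  # Nothing left to produce

print(first_pass)   # Output: [1, 4, 9]
print(second_pass)  # Output: []

# If the values are needed more than once, build a list instead
# (or create a fresh generator expression for each pass).
squares_list = [n * n for n in numbers]
print(sum(squares_list), max(squares_list))  # Output: 14 9
```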
Chapter Summary
Type | Basic Syntax | Output | Key Feature | Example (Conceptual) |
---|---|---|---|---|
List Comprehension | [expression for item in iterable] | List [] | Creates a new list in memory immediately (eager evaluation). | [x*x for x in range(5)] -> [0, 1, 4, 9, 16] |
Generator Expression | (expression for item in iterable) | Generator object | Produces items one by one on demand (lazy evaluation), memory efficient for large sequences. | (x*x for x in range(5)) -> <generator object> |
Dictionary Comprehension | {key_expr: val_expr for item in iterable} | Dictionary {} | Creates a new dictionary. Requires key and value expressions. | {x: x*x for x in range(3)} -> {0: 0, 1: 1, 2: 4} |
Set Comprehension | {expression for item in iterable} | Set {} | Creates a new set, automatically handling uniqueness. | {x % 2 for x in [1, 2, 1, 3]} -> {0, 1} |
Filtering (Applies to all) | [... for item in iterable if condition]; (... for item in iterable if condition); {... for item in iterable if condition} | Filtered collection | Includes only items from the iterable where the if condition is true. | [x for x in range(5) if x > 2] -> [3, 4] |
Conditional Expression (Applies to all) | [val_true if cond else val_false for item in iterable]; (... for item in iterable); {... for item in iterable} | Transformed collection | Applies different expressions based on a condition for each item. Affects the output value. | ['even' if x%2==0 else 'odd' for x in range(3)] -> ['even', 'odd', 'even'] |
Nested Comprehension (List example) | [elem for row in matrix for elem in row] | List (often flattened) | Processes nested iterables. Order matches nested loops. | [i for r in [[1],[2,3]] for i in r] -> [1, 2, 3] |
- List comprehensions ([expr for item in iterable if cond]) provide a concise way to create lists.
- Generator expressions ((expr for item in iterable if cond)) create generators, offering memory efficiency through lazy evaluation.
- Conditional filtering (if cond) can be added to include only certain items from the iterable.
- Conditional expressions (val_true if cond else val_false) can be used in the expr part to transform items differently.
- Nested comprehensions allow processing of nested iterables, mirroring nested loops.
- Dictionary comprehensions ({key_expr: val_expr for ...}) create dictionaries.
- Set comprehensions ({expr for ...}) create sets, automatically handling uniqueness.
- Comprehensions and expressions enhance code conciseness but should be used judiciously to maintain readability.
Exercises & Mini Projects
Exercises
- Squares of Evens: Use a list comprehension to create a list containing the squares of all even numbers from 0 to 20 (inclusive).
- Word Lengths: Given a sentence string (e.g., “This is a sample sentence”), use a dictionary comprehension to create a dictionary where keys are the words and values are the lengths of those words.
- Unique Vowels: Given a string, use a set comprehension to create a set containing all the unique vowels (a, e, i, o, u, case-insensitive) present in the string.
- Conditional Labeling: Use a list comprehension and a conditional expression to create a list from range(10) where numbers less than 5 are labeled “small” and numbers 5 or greater are labeled “large”.
- Flatten List: Given a list of lists nested = [[1, 2], [3, 4, 5], [6]], use a nested list comprehension to create a single flattened list [1, 2, 3, 4, 5, 6].
Mini Project: Refactoring with Comprehensions
Goal: Take a previous exercise or mini-project that used loops to create a list, dictionary, or set, and refactor it to use the corresponding comprehension or expression.
Choose one of the following (or similar code you’ve written):
- Word Counter (Chapter 7): Refactor the loop that builds the word_counts dictionary to use a dictionary comprehension (this might be slightly tricky due to needing the counts; consider using collections.Counter first, then a comprehension if desired, or stick to the original loop if the comprehension becomes too complex).
- Event Reminder (Chapter 20): Refactor the part where you process the events list. Can you use a list comprehension or generator expression to create tuples or objects containing the event name and the calculated days remaining?
- Any Exercise Creating a List: Find an exercise from earlier chapters (e.g., calculating factorials into a list, filtering numbers) where you used a for loop and .append() to build a list. Rewrite that logic using a list comprehension.
Steps:
- Identify the code block with the for loop that populates a list, dictionary, or set.
- Analyze the logic: What is the source iterable? What transformation is applied to each item? Is there any filtering condition?
- Rewrite the logic using the appropriate comprehension syntax (list, dict, set, or generator expression).
- Ensure the refactored code produces the same result as the original loop-based code.
- Compare the readability of the two versions. Is the comprehension clearer in this case?