Programming with Python | Chapter 3: Strings, Indexing, and Slicing

Chapter Objectives

  • Understand that strings are sequences of characters.
  • Access individual characters in a string using indexing (positive and negative).
  • Extract substrings using slicing with start, stop, and step values.
  • Learn about string immutability.
  • Use the built-in len() function and common string methods for manipulation (e.g., .lower(), .upper(), .strip(), .replace(), .find(), .startswith(), .endswith()).
  • Format strings effectively using f-strings.

Introduction

In the previous chapter, we introduced strings (str) as one of Python’s fundamental data types for representing text. This chapter explores strings in more detail. We’ll treat strings as sequences, learning how to access specific characters or ranges of characters using techniques called indexing and slicing. We will also discover that strings are immutable in Python, meaning they cannot be changed directly once created. Finally, we’ll cover several built-in string methods that allow us to perform common operations like changing case, removing whitespace, searching, and replacing text, along with modern f-strings for easy formatting.

Theory & Explanation

Strings as Sequences

Think of a string as an ordered sequence of characters. Each character in the string has a specific position, or index. Python uses zero-based indexing, meaning the first character is at index 0, the second at index 1, and so on.
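
Because a string is a sequence, len() reports how many positions it has, and a for loop visits its characters in order. The following is a minimal sketch; the variable word is a sample string chosen for illustration, and enumerate() is a built-in that pairs each character with its index.

```python
word = "Python"  # hypothetical sample string

# len() gives the number of characters, so valid indices run from 0 to len(word) - 1
length = len(word)

# enumerate() yields (index, character) pairs in order
pairs = []
for index, char in enumerate(word):
    pairs.append((index, char))
    print(index, char)
```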

Indexing: Accessing Single Characters

Index positions for the string "Python" (positive and negative):

  Character:        P    y    t    h    o    n
  Positive index:   0    1    2    3    4    5
  Negative index:  -6   -5   -4   -3   -2   -1

You can access a single character within a string using square brackets [] with the index number inside.

  • Positive Indexing: Starts from 0 for the first character and increases.
  • Negative Indexing: Starts from -1 for the last character, -2 for the second-to-last, and so on. This is useful for accessing characters from the end of the string without needing to know its length.
Python
message = "Python"

# Positive Indexing
first_char = message[0]  # 'P'
second_char = message[1] # 'y'

# Negative Indexing
last_char = message[-1] # 'n'
second_last_char = message[-2] # 'o'

print("First character:", first_char)
print("Last character:", last_char)
# print(message[6]) # This would cause an IndexError because the valid indices are 0-5
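
One way to see how the two numbering schemes relate: for a string of length n, the negative index -k names the same character as the positive index n - k. A short sketch, where n is just a helper variable introduced here:

```python
message = "Python"
n = len(message)  # 6

# A negative index -k refers to the same character as the positive index n - k
same_last = message[-1] == message[n - 1]   # 'n' in both cases
same_first = message[-n] == message[0]      # 'P' in both cases
print(same_last, same_first)
```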

Slicing: Extracting Substrings

Slicing allows you to extract a portion (a substring) from a string. The syntax is string[start:stop:step].

  • start: The index where the slice begins (inclusive). If omitted, the slice starts at the beginning of the string (index 0).
  • stop: The index where the slice ends (exclusive). If omitted, the slice runs to the end of the string.
  • step: The amount to increment the index by (defaults to 1). A step of 2 takes every other character. With a negative step the slice runs backward, so string[::-1] (start and stop omitted) returns the string reversed.
Python
text = "Programming"

# Indices:  P  r  o  g  r  a  m  m  i  n  g
#           0  1  2  3  4  5  6  7  8  9 10
#         -11-10 -9 -8 -7 -6 -5 -4 -3 -2 -1

# Get "Program" (indices 0 up to, but not including, 7)
sub1 = text[0:7]
print(sub1) # Output: Program

# Get "gram" (indices 3 up to, but not including, 7)
sub2 = text[3:7]
print(sub2) # Output: gram

# Omit start (from beginning up to index 4)
sub3 = text[:4]
print(sub3) # Output: Prog

# Omit stop (from index 8 to the end)
sub4 = text[8:]
print(sub4) # Output: ing

# Every second character (step=2)
sub5 = text[0:11:2] # or text[::2]
print(sub5) # Output: Pormig

# Reverse the string (step=-1)
reversed_text = text[::-1]
print(reversed_text) # Output: gnimmargorP

# Using negative indices in slicing
# Get "ming" (from index -4 to the end)
sub6 = text[-4:]
print(sub6) # Output: ming
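
The exclusive stop index has a practical consequence that may be worth seeing in code: splitting a string at any index k gives two pieces that join back into the original. The sketch below also shows that a slice running past the end is clamped instead of raising IndexError. The names k, left, right, middle, and tail are illustrative.

```python
text = "Programming"

# Because stop is exclusive, splitting at index k loses nothing and duplicates nothing
k = 7
left, right = text[:k], text[k:]
print(left, right)

# For in-range indices with the default step, the slice length is stop - start
middle = text[3:7]
print(len(middle))

# Unlike indexing, a slice that runs past the end is clamped rather than raising IndexError
tail = text[8:100]
print(tail)
```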

String Immutability

An important characteristic of Python strings is that they are immutable. This means that once a string object is created, its contents cannot be changed in place. Operations that seem to modify a string (like .replace() or .lower()) actually create and return a new string object with the modified content. The original string remains unchanged.

Python
greeting = "Hello"
# greeting[0] = "J" # This will cause a TypeError: 'str' object does not support item assignment

# To "change" the greeting, you create a new string and reassign the variable
new_greeting = "J" + greeting[1:] # Concatenate "J" with a slice of the old string
print(new_greeting) # Output: Jello
print(greeting)     # Output: Hello (original string is unchanged)
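
The same slice-and-concatenate idea works at any position, not just the first character. As a sketch, the hypothetical helper replace_char_at below (not part of Python or of this chapter) builds a new string from the part before the index, the new character, and the part after it.

```python
def replace_char_at(s, index, new_char):
    """Return a copy of s with the character at index replaced (hypothetical helper)."""
    return s[:index] + new_char + s[index + 1:]

greeting = "Hello"
changed = replace_char_at(greeting, 1, "a")
print(changed)   # Hallo
print(greeting)  # Hello (the original string is unchanged)
```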

Common String Methods

Methods are like functions that belong to an object (in this case, a string object). You call them using dot notation (string_variable.method_name()).

Examples use s = " Hello World! " (one leading and one trailing space).

| Method | Description | Example | Result |
| --- | --- | --- | --- |
| len(s) | Returns the number of characters in the string (including spaces) | len(s) | 14 |
| s.lower() | Returns a new string with all characters converted to lowercase | s.lower() | " hello world! " |
| s.upper() | Returns a new string with all characters converted to uppercase | s.upper() | " HELLO WORLD! " |
| s.strip() | Returns a new string with leading/trailing whitespace removed | s.strip() | "Hello World!" |
| s.lstrip() | Returns a new string with leading whitespace removed | s.lstrip() | "Hello World! " |
| s.rstrip() | Returns a new string with trailing whitespace removed | s.rstrip() | " Hello World!" |
| s.replace(old, new) | Returns a new string where all occurrences of old are replaced with new | s.strip().replace("o", "a") | "Hella Warld!" |
| s.find(substring) | Returns the starting index of the first occurrence of substring. Returns -1 if not found | s.strip().find("World") | 6 |
| s.startswith(prefix) | Returns True if the string starts with prefix, False otherwise | s.strip().startswith("He") | True |
| s.endswith(suffix) | Returns True if the string ends with suffix, False otherwise | s.strip().endswith("!") | True |
| s.split(separator) | Returns a list of substrings split by separator (default is whitespace) | s.strip().split(" ") | ['Hello', 'World!'] |
| s.isdigit() | Returns True if the string is non-empty and all characters are digits, False otherwise | "123".isdigit() | True |
| s.isalpha() | Returns True if the string is non-empty and all characters are letters, False otherwise | "Python".isalpha() | True |

Note: len() is a built-in function, not a method, so it’s called as len(string) instead of string.len().
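
Two details from the table are easy to trip over: split() with no argument behaves differently from split(" "), and find() signals "not found" with -1 rather than with an error. The sketch below uses a made-up sample string, line, that contains a double space.

```python
line = "red  green blue"  # hypothetical sample; note the two spaces after "red"

# With no argument, split() treats any run of whitespace as a single separator
words_default = line.split()
print(words_default)

# With an explicit " ", every single space is a separator,
# so the double space produces an empty string in the result
words_single = line.split(" ")
print(words_single)

# find() returns -1 when the substring is absent, so compare the result against -1
position = line.find("green")
missing = line.find("purple")
print(position, missing)
```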

Formatted String Literals (f-strings)

Introduced in Python 3.6, f-strings provide a concise and readable way to embed expressions inside string literals. You prefix the string with the letter f or F and enclose expressions in curly braces {}.

Python
name = "Alice"
age = 30
city = "New York"

# Old way (using string concatenation - cumbersome)
# message_old = "My name is " + name + ", I am " + str(age) + " years old and live in " + city + "."

# Using f-strings (cleaner and more readable)
message_fstring = f"My name is {name}, I am {age} years old and live in {city}."
print(message_fstring)
# Output: My name is Alice, I am 30 years old and live in New York.

# You can embed expressions directly
item = "laptop"
price = 1200.50
tax_rate = 0.08
total_cost = price * (1 + tax_rate)

invoice_summary = f"Item: {item.upper()}\nPrice: ${price:.2f}\nTotal Cost (incl. tax): ${total_cost:.2f}"
print(invoice_summary)
# Output:
# Item: LAPTOP
# Price: $1200.50
# Total Cost (incl. tax): $1296.54
  • Inside the {} you can put variable names, calculations, or even function/method calls.
  • You can add a format specifier after a colon (:) inside the braces, such as :.2f to format a float to 2 decimal places. Method calls such as .upper() are part of the expression itself, so they go before the colon.
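
Beyond :.2f, a few other format specifiers come up often. The sketch below is illustrative only; the variable names and values are made up, and the specifiers shown (thousands separator, percentage, alignment width, zero padding) are part of Python's standard format specification.

```python
price = 1234567.891   # hypothetical values for illustration
tax_rate = 0.08
label = "total"
count = 42

with_commas = f"{price:,.2f}"      # thousands separators, 2 decimal places
as_percent = f"{tax_rate:.1%}"     # multiply by 100 and append a percent sign
right_aligned = f"{label:>8}"      # pad on the left to a width of 8
zero_padded = f"{count:05d}"       # pad an integer with zeros to a width of 5

print(with_commas, as_percent, right_aligned, zero_padded)
```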

Code Examples

Example 1: Indexing, Slicing, and Immutability

Python
# string_ops.py

filename = "document_report_final.txt"

# Indexing
print(f"First character: {filename[0]}")        # d
print(f"Last character: {filename[-1]}")         # t
print(f"Extension character 1: {filename[-3]}")  # t

# Slicing
base_name = filename[0:15] # Get "document_report"
print(f"Base name: {base_name}")

extension = filename[-4:] # Get ".txt"
print(f"Extension: {extension}")

report_part = filename[9:15] # Get "report"
print(f"Report part: {report_part}")

# Demonstrating Immutability
print(f"Original filename ID: {id(filename)}")
cleaned_filename = filename.replace("_", " ").capitalize() # Creates a NEW string
print(f"Cleaned filename: {cleaned_filename}")
print(f"Cleaned filename ID: {id(cleaned_filename)}") # Different ID
print(f"Original filename is unchanged: {filename}")
print(f"Original filename ID is unchanged: {id(filename)}") # Same as original ID

Explanation:

  • We use indexing ([]) with positive and negative numbers to access individual characters.
  • Slicing ([:]) extracts substrings like the base name and extension.
  • The .replace() and .capitalize() methods don’t change the original filename. Each returns a new string, and we assign the final result to cleaned_filename. The id() function shows that filename and cleaned_filename refer to different objects in memory.
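
The slice boundaries 15 and -4 in Example 1 are specific to this one filename. As a hedged alternative, the sketch below locates the last dot with .rfind(), a string method not covered in this chapter that works like .find() but searches from the right. It assumes the filename contains a dot; if it does not, .rfind() returns -1 and the slices would need a separate check.

```python
filename = "document_report_final.txt"

# rfind() returns the index of the LAST occurrence of the substring
dot_index = filename.rfind(".")

# Everything before the dot is the base name; the dot and what follows is the extension
base_name = filename[:dot_index]
extension = filename[dot_index:]

print(f"Base name: {base_name}")
print(f"Extension: {extension}")
```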

Example 2: String Methods and f-strings

Python
# methods_fstrings.py

raw_data = "   ProductID:12345, Name: Gadget Pro, Price: 99.99   "
print(f"Original data: '{raw_data}'")

# Clean the data
cleaned_data = raw_data.strip()
print(f"Stripped data: '{cleaned_data}'")

# Find the position of key information
name_start_index = cleaned_data.find("Name:") + len("Name:")
price_label_index = cleaned_data.find("Price:")  # where the "Price:" label begins
price_start_index = price_label_index + len("Price:")

# Extract information using slicing
# The name ends 2 characters before the "Price:" label (skipping ', ')
product_name = cleaned_data[name_start_index : price_label_index - 2].strip()
price_str = cleaned_data[price_start_index:].strip()

# Convert price to float
price_float = float(price_str)

# Present the information using an f-string
output = f"""
Product Report:
--------------------
Name:  {product_name.upper()}
Price: ${price_float:.2f}
Is it expensive? {price_float > 50.0}
--------------------
"""
print(output)

# Check if data starts/ends with specific text
print(f"Does raw data start with space? {raw_data.startswith(' ')}") # True
print(f"Does cleaned data end with '99'? {cleaned_data.endswith('99')}") # True

Explanation:

  • .strip() removes leading/trailing whitespace from raw_data.
  • .find() locates the starting positions of “Name:” and “Price:”. We add the length of these labels to get the index where the actual value starts.
  • Slicing extracts the product_name and price_str. Note the .strip() used again on each extracted value to remove the space that follows its label.
  • float() converts the extracted price_str to a number for potential calculations.
  • An f-string (using triple quotes f"""...""" for multi-line) formats the extracted data neatly for output. It includes calling .upper() on the name and formatting the price.
  • .startswith() and .endswith() check the beginning and end of strings.
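
For comparison, the same record could also be taken apart with .split() instead of index arithmetic. This is a sketch under the assumption that fields are always separated by ", " and that each field has the form label:value; the second argument to split() (1) limits it to a single split at the first colon.

```python
raw_data = "   ProductID:12345, Name: Gadget Pro, Price: 99.99   "

# Split the cleaned record into its three "label:value" fields
fields = raw_data.strip().split(", ")

# For each field, keep the part after the first colon and trim the space
product_id = fields[0].split(":", 1)[1].strip()
product_name = fields[1].split(":", 1)[1].strip()
price = float(fields[2].split(":", 1)[1].strip())

print(f"{product_id} | {product_name} | ${price:.2f}")
```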

Common Mistakes or Pitfalls

  • IndexError: Trying to access an index that is outside the valid range of the string (e.g., accessing my_string[10] when the string only has 5 characters).
  • TypeError (Immutability): Attempting to modify a character in a string directly using index assignment (e.g., my_string[0] = 'X'). Remember to create new strings instead.
  • Off-by-One Errors in Slicing: Forgetting that the stop index in slicing is exclusive. my_string[0:5] includes characters at indices 0, 1, 2, 3, and 4, but not 5.
  • Forgetting Methods Return New Strings: Calling a method like .lower() or .strip() without assigning the result back to a variable (or using it directly) means the change is lost, as the original string remains unchanged.
Python
my_string = " Data "
my_string.strip() # WRONG: The result isn't saved
print(my_string) # Output: " Data " (still has spaces)
my_string = my_string.strip() # CORRECT: Assign the result back
print(my_string) # Output: "Data"
  • Case Sensitivity: Methods like .find() and .replace() are case-sensitive. "Hello".find("h") returns -1 because the string contains "H", not "h".
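
Two of these pitfalls have simple guards, sketched below with illustrative variable names: check an index against len() before using it, and lowercase both the string and the search term when the search should ignore case.

```python
my_string = "Hello"

# Guard against IndexError by checking the index against len() first
index = 10
if index < len(my_string):
    char = my_string[index]
else:
    char = None  # index 10 is out of range for a 5-character string
print(char)

# Case-insensitive search: lowercase both sides before calling find()
search_term = "h"
position = my_string.lower().find(search_term.lower())
print(position)
```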

Chapter Summary

  • Strings are ordered, immutable sequences of characters.
  • Indexing (string[index]) accesses single characters (0-based, negative indices count from the end).
  • Slicing (string[start:stop:step]) extracts substrings. The stop index is exclusive.
  • Strings cannot be changed in place (immutability). Methods that modify strings return new string objects.
  • Useful string methods include .lower(), .upper(), .strip(), .replace(), .find(), .startswith(), .endswith(), .split().
  • The len() function returns the string’s length.
  • f-strings (f"...") provide a convenient way to embed expressions and format variables within strings using {expression} syntax.

Exercises & Mini Projects

Exercises

  1. Character Explorer: Given the string s = "Fundamentals", write code to:
    • Print the first character.
    • Print the last character using negative indexing.
    • Print the character ‘d’.
    • Try to access the character at index 15 and observe the IndexError.
  2. Slicing Practice: Using the string s = "Programming is fun!", extract and print the following substrings:
    • “Programming”
    • “fun!”
    • “gram”
    • The entire string reversed.
    • Every third character of the string.
  3. Method Chaining: Start with the string data = " python programming ". Use string methods, potentially chaining them together (e.g., data.strip().upper()), to achieve the following and print the result at each step:
    • Remove leading/trailing whitespace.
    • Convert the result to uppercase.
    • Replace “PYTHON” with “JAVA”.
    • Check if the final result ends with “MING”.
  4. f-string Formatting: Create variables for item = "book", price = 15.99, and quantity = 3. Use an f-string to print a message like: “Order details: 3 units of book(s) cost $47.97 total.” Ensure the total cost is calculated within the f-string and formatted to two decimal places.
  5. Simple Validator: Ask the user to input a potential username using input(). Then, use string methods to check if the username:
    • Contains any spaces (Hint: use find(); the in operator, covered later, also works).
    • Is purely alphabetical (Hint: isalpha()).
    Print appropriate messages based on these checks (e.g., “Username cannot contain spaces.”, “Username must be letters only.”).

Mini Project: Text Sanitizer

Goal: Create a script sanitizer.py that takes a potentially messy string input and cleans it up for basic processing.

Steps:

  1. Define a sample messy string variable, e.g., raw_text = " EXTRA SPACES AND weird CASE \n\t ". (The \n is a newline, \t is a tab – both are whitespace).
  2. Use string methods to perform the following cleaning steps, printing the result after each major transformation:
    • Convert the entire string to lowercase.
    • Remove all leading and trailing whitespace (including newlines and tabs).
    • Replace occurrences of “extra” with “removed”.
    • Replace occurrences of “weird” with “normalized”.
  3. Finally, use an f-string to print the original string and the fully sanitized string, clearly labeling each.

More Sources:

https://docs.python.org/3/library/string.html#module-string
