Chapter 55: Process Concepts: Process ID (PID), Parent PID (PPID), Process States
Chapter Objectives
By the end of this chapter, you will be able to:
- Understand the fundamental concept of a process in a Linux environment and its role as the basic unit of execution.
- Explain the significance of the Process ID (PID) and Parent Process ID (PPID) and how they establish relationships within the process hierarchy.
- Describe the various states a process can be in throughout its lifecycle, including running, sleeping, stopped, and zombie.
- Implement C programs that create and manage child processes using the fork() and exec() family of system calls.
- Utilize standard Linux command-line tools and the /proc filesystem to inspect the state and attributes of running processes on a Raspberry Pi 5.
- Debug common process-related issues, such as orphaned processes and the accumulation of zombie processes.
Introduction
A process, a program in execution, is one of the most fundamental concepts in computing. An embedded Linux system must often manage numerous concurrent tasks, from reading sensor data and updating a display to handling network communication, so a deep understanding of process management is not merely academic; it is essential for building robust, reliable, and efficient devices. Every command you run, every application you launch, and every background service that keeps the system humming is encapsulated within a process. These processes are the active, living entities that the Linux kernel must skillfully juggle, allocate resources to, and protect from one another.
This chapter delves into the core of process identity and lifecycle. We will explore how the kernel uniquely identifies every process with a Process ID (PID) and tracks its lineage through a Parent Process ID (PPID). This parent-child relationship forms a hierarchical tree structure that is foundational to how Linux organizes and manages tasks. We will then journey through the lifecycle of a process, examining the different states it transitions through, from its creation to its termination. Understanding these states is critical for debugging, as a process that is unresponsive or consuming excessive CPU time often reveals its condition through its current state. Using the Raspberry Pi 5 as our practical platform, you will move from theory to hands-on application, learning to create, monitor, and interpret the behavior of processes, a skill set that is indispensable for any embedded Linux developer.
Technical Background
The Anatomy of a Process
To truly grasp what a process is, it’s helpful to distinguish it from a program. A program is a passive entity: a file on your disk containing a set of instructions and data, like the binary file /bin/ls. A process, on the other hand, is the active, dynamic instance of that program when it is loaded into memory and executed. You can run the same program multiple times, and each time, the Linux kernel creates a new, distinct process with its own unique identity and resources.
The kernel sees a process as an abstract entity that requires resources to perform its task. The primary resource is, of course, CPU time. However, a process is much more than just a stream of instructions. The kernel maintains a significant amount of information about each process in a C structure known as the process descriptor, which is of type task_struct in the kernel source code. This descriptor is the kernel’s central data structure for managing a process, holding or pointing to every piece of information it needs.
This includes the process’s identity. The most fundamental identifier is the Process ID (PID), a unique positive integer assigned to the process upon its creation. PIDs are normally allocated sequentially; once the counter reaches the system limit (readable from /proc/sys/kernel/pid_max), it wraps around and the kernel reuses values that are no longer in use. When the system boots, the kernel starts its first user-space process, typically called init or systemd, and assigns it a PID of 1. This process is the ancestor of every other user-space process that will run on the system (kernel threads descend from a separate kernel thread, kthreadd, which is usually PID 2).
graph TD
    subgraph Process Tree
        A(<b>init/systemd</b><br><i>PID: 1</i><br><i>PPID: 0</i>)
        A --> B(<b>Login Shell / Service</b><br><i>PID: 550</i><br><i>PPID: 1</i>)
        A --> C("<b>System Service (e.g., sshd)</b><br><i>PID: 620</i><br><i>PPID: 1</i>")
        B --> D("<b>User Shell (bash)</b><br><i>PID: 1125</i><br><i>PPID: 550</i>")
        D --> E("<b>Executed Command (ps)</b><br><i>PID: 1140</i><br><i>PPID: 1125</i>")
    end
    classDef primary fill:#1e3a8a,stroke:#1e3a8a,stroke-width:2px,color:#ffffff
    classDef process fill:#0d9488,stroke:#0d9488,stroke-width:1px,color:#ffffff
    classDef system fill:#8b5cf6,stroke:#8b5cf6,stroke-width:1px,color:#ffffff
    class A,B,C,D,E process
    class A system
Just as important as the PID is the Parent Process ID (PPID). The PPID is simply the PID of the process that created it. This parent-child relationship is central to Linux’s design. When you type a command like ls -l into your shell, the shell process (the parent) creates a new child process to execute the ls program. The new ls process will have its own unique PID, and its PPID will be the PID of your shell. This creates a clear and traceable hierarchy. If a parent process terminates before its children, these children become “orphaned.” In this case, they are “adopted” by the init process (PID 1), or by a closer ancestor that has registered itself as a subreaper (as per-user systemd instances do), which becomes their new parent. This ensures no process is ever left unaccounted for in the system’s hierarchy.
Beyond identity, the process descriptor contains a wealth of other information that defines the process’s context. This includes its memory management information, such as pointers to the page tables that map its virtual addresses to physical memory. This ensures each process operates in its own protected virtual address space, preventing it from interfering with the kernel or other processes. It also holds the process’s credentials, such as its user ID (UID) and group ID (GID), which determine its permissions for accessing files and other system resources. Finally, it tracks the set of open files, maintaining a list of file descriptors that the process can read from or write to. All these components together form the complete execution context of a process.
The Process Lifecycle: From Creation to Termination
A process’s life is a well-defined journey with distinct stages, managed through a handful of powerful system calls. The primary mechanism for creating a new process in Linux is the fork() system call.
When a process calls fork(), the kernel creates a near-exact duplicate of the calling process. The new process, the child, gets its own unique PID, but it inherits a copy of almost everything else from the parent. This includes the parent’s memory space (using a clever technique called copy-on-write to make it efficient), its open file descriptors, and its credentials. The fork() call is unique in that it is called once but returns twice: once in the parent process and once in the child. In the parent’s context, fork() returns the PID of the newly created child. In the child’s context, it returns 0. This difference is the crucial mechanism that allows the program’s logic to diverge, enabling the parent and child to perform different tasks. A common pattern is for the parent to use the returned PID to monitor the child, perhaps by waiting for it to finish its work.
However, creating a clone of a process is often just the first step. Usually, the goal is not to run two identical copies of the same program but to run a different program. This is accomplished using the exec() family of system calls (e.g., execlp(), execve()). When a process calls one of the exec() functions, the kernel completely overlays the calling process’s memory space with the new program specified in the exec() call. The old program’s code, data, and stack are discarded, and the new program starts executing from its beginning. Crucially, the PID does not change. The process is the same; only the program it is executing has been replaced.
This fork()-then-exec() pattern is the cornerstone of command execution in Unix-like systems. When you type a command in your shell, the shell first calls fork() to create a child process. Then, this child process immediately calls exec() to load and run the command you specified. The parent shell, meanwhile, typically calls wait() or waitpid(), a system call that suspends its execution until the child process terminates, allowing it to retrieve the child’s exit status.
The final stage of the lifecycle is termination. A process can terminate in several ways. The most common is a normal exit, which occurs when the program finishes its task and calls exit() (or returns from main), which in turn asks the kernel to end the process. The process can also be terminated involuntarily by receiving a signal that it cannot handle or ignore, such as the SIGKILL signal. When a process terminates, the kernel releases most of its resources, but it retains a small amount of information in the process table: specifically, the process’s PID and its exit status. It keeps this information so that the parent process has an opportunity to find out how its child terminated. This lingering, defunct state is what leads to the concept of a “zombie” process.
Understanding Process States
At any given moment, every process on the system is in a specific state. This state reflects its current activity and is what the kernel’s scheduler uses to decide which process to run next. You can view these states using tools like ps or top.
stateDiagram-v2
    direction LR
    [*] --> Runnable: Process Created (fork)
    state "Runnable / Running (R)" as Runnable
    state "Interruptible Sleep (S)" as Sleep_S
    state "Uninterruptible Sleep (D)" as Sleep_D
    state "Stopped (T)" as Stopped
    state "Zombie (Z)" as Zombie
    Runnable --> Runnable: Scheduler Preemption
    Runnable --> Sleep_S: Waiting for Event (e.g., I/O, timer)
    Runnable --> Sleep_D: Waiting for Hardware (Uninterruptible I/O)
    Runnable --> Stopped: Signal Received (SIGSTOP, SIGTSTP)
    Sleep_S --> Runnable: Event Occurs
    Sleep_S --> Stopped: Signal Received (SIGSTOP)
    Sleep_D --> Runnable: Hardware I/O Complete
    Stopped --> Runnable: Signal Received (SIGCONT)
    Runnable --> Zombie: exit() called
    Zombie --> [*]: Parent calls wait()
    state "Kernel Space" as Kernel {
        note right of Runnable
            On CPU or ready in run queue
            Managed by the kernel scheduler
        end note
        note right of Sleep_S
            Waiting for an event
            Can be woken by signals
            Consumes no CPU
        end note
        note left of Sleep_D
            Waiting for I/O, cannot be interrupted
            Unkillable until I/O finishes
        end note
        note right of Stopped
            Halted by a signal
            Awaiting SIGCONT to resume
        end note
        note right of Zombie
            Process terminated, awaiting parent to read exit status
            Consumes no resources except a process table entry
        end note
    }
The primary states are:
- R (Runnable/Running): A process in this state is either currently executing on a CPU core or is ready to execute and waiting in a run queue for the scheduler to give it a turn. The ps command shows ‘R’ for both scenarios; from the user’s perspective, any process that is not waiting for an external event is considered runnable.
- S (Interruptible Sleep): This is one of the most common states for a process. A process enters this state when it is waiting for something to happen. For example, it might be waiting for user input from the keyboard, for data to arrive from the network, or for a specific amount of time to pass (e.g., using the sleep() function). It is called “interruptible” because it can be woken up prematurely by a signal. Most processes on a typical system spend the vast majority of their time in this state, consuming no CPU cycles.
- D (Uninterruptible Sleep): This state is similar to interruptible sleep, but with a critical difference: a process in this state cannot be woken up by a signal. It will only wake up when the event it is waiting for occurs, which is almost always a direct response from hardware, typically related to I/O (Input/Output) operations. For example, a process trying to read a block from a disk drive might enter this state until the disk controller signals that the data is ready. This state is necessary to prevent data corruption that could occur if a process were interrupted in the middle of a sensitive hardware operation. While essential, processes stuck in a ‘D’ state can be problematic as they cannot be killed, even with SIGKILL, and can only be cleared by a reboot if the underlying I/O issue is unrecoverable.
- T (Stopped or Traced): A process enters this state when it receives a stop signal, such as SIGSTOP or SIGTSTP. This is commonly used for job control in a shell (pressing Ctrl-Z sends SIGTSTP to the foreground job) or for debugging. A debugger like gdb will stop a process to allow a developer to inspect its memory and state. The process will remain stopped until it receives a SIGCONT signal to resume its execution.
- Z (Zombie): A process that has terminated but whose parent has not yet “reaped” it by calling wait() or waitpid() is called a zombie. It is not a running process and consumes no CPU resources. However, it still occupies a slot in the kernel’s process table, holding onto its PID and exit code. The kernel keeps this minimal entry so the parent can learn about its child’s fate. Once the parent calls wait(), the zombie is fully removed from the system. A small number of zombies appearing and disappearing is normal. However, a large or persistent accumulation of zombie processes indicates a bug in the parent process, which is failing to properly clean up after its children. This is a resource leak that can eventually prevent new processes from being created if the process table fills up.
Practical Examples
This section provides hands-on examples for the Raspberry Pi 5, demonstrating how to create, inspect, and manage processes. You will need a Raspberry Pi 5 running the latest Raspberry Pi OS (or a similar Debian-based distribution) and access to its command line, either directly or via SSH.
Inspecting Processes with ps and /proc
The most direct way to see the processes running on your system is with the ps command.
Step-by-Step Inspection:
1. Log in to your Raspberry Pi 5.
2. Run the ps command with options to show all processes in a user-oriented listing. The aux options are classic and effective.
ps aux
3. Analyze the output. You will see a table of all running processes. Pay close attention to these columns:
- USER: The user who owns the process.
- PID: The Process ID.
- STAT: The current state of the process (R, S, D, T, Z), possibly followed by modifier characters such as s (session leader) or + (foreground process group).
- COMMAND: The command that launched the process.
Note that ps aux does not print a PPID column. To see Parent Process IDs, use ps -ef or select the columns explicitly with ps -eo pid,ppid,stat,cmd.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.1 0.2 169420 11068 ? Ss 04:00 0:02 /sbin/init
root 450 0.0 0.1 88220 7100 ? Ss 04:01 0:00 /lib/systemd/systemd-journald
pi 1234 0.0 0.2 10500 5500 pts/0 Ss 05:10 0:00 -bash
pi 1256 0.0 0.1 8900 3200 pts/0 R+ 05:12 0:00 ps aux
In this example, PID 1 is /sbin/init, which on Raspberry Pi OS is a symbolic link to systemd. The user pi is running a bash shell (PID 1234), whose parent is likely the login service. The ps aux command itself is running as PID 1256; although this listing does not show PPIDs, its parent is 1234, the shell that launched it. Its state is ‘R’ because it is actively running, and the + marks it as part of the terminal’s foreground process group.
4. Explore the /proc Filesystem. The ps command gets its information from the /proc filesystem, a virtual filesystem that provides a real-time window into the kernel’s data structures. Each running process has a corresponding directory named after its PID.
# Let's inspect the shell process (assuming its PID is 1234)
ls -l /proc/1234
You will see a list of virtual files and directories. The file status is particularly informative.
cat /proc/1234/status
Expected Output (snippet):
Name: bash
State: S (sleeping)
Pid: 1234
PPid: 1098
Uid: 1000 1000 1000 1000
Gid: 1000 1000 1000 1000
...
This provides a detailed, human-readable summary of the process’s state, PID, PPID, user/group IDs, and memory usage. Notice the state is ‘S’ (sleeping), as the shell is waiting for your next command.
Creating Processes with C
Now, let’s write a C program to create a child process using fork(). This example illustrates the fundamental fork()–exec() pattern.
sequenceDiagram
    actor User
    participant Shell as Parent Process
    participant Child as Child Process
    participant Kernel
    User->>+Shell: Enters command (e.g., "ls -l")
    Shell->>Kernel: fork()
    Kernel->>Shell: return child_pid
    Kernel->>+Child: return 0
    Note over Shell,Child: Both processes are now running
    Child->>Kernel: execlp("ls", "ls", "-l", NULL)
    Note right of Child: Child process image is replaced by 'ls'
    Kernel->>Child: (Does not return on success)
    Shell->>Kernel: waitpid(child_pid, &status, 0)
    Note left of Shell: Parent is suspended,<br/>waiting for child to terminate
    Child->>Kernel: exit(0)
    deactivate Child
    Kernel->>Shell: Child terminated, return status
    Shell->>-User: Displays prompt again
Code Snippet: process_creator.c
Create a file named process_creator.c on your Raspberry Pi.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/**
 * main - Demonstrates the fork-exec-wait pattern.
 *
 * This program forks a child process. The child process then uses execlp
 * to run the 'ls -l /' command. The parent process waits for the child
 * to complete and then prints a message.
 *
 * Return: 0 on success, 1 on failure.
 */
int main(void) {
    pid_t child_pid;
    int status;

    // Create a new process by duplicating the current one.
    child_pid = fork();

    if (child_pid == -1) {
        // fork() returns -1 if the creation of a child process was unsuccessful.
        perror("fork failed");
        exit(EXIT_FAILURE);
    }

    if (child_pid == 0) {
        // This block is executed by the CHILD process, because fork() returns 0 to the child.
        printf("CHILD: I am the child process, my PID is %d\n", getpid());
        printf("CHILD: My parent's PID is %d\n", getppid());
        printf("CHILD: Now, I will execute 'ls -l /'...\n\n");

        // Flush pending output first: exec discards the stdio buffer together
        // with the rest of the old process image.
        fflush(stdout);

        // Replace the current process image with the 'ls' program.
        // execlp searches for the command in the system's PATH.
        execlp("ls", "ls", "-l", "/", (char *)NULL);

        // If execlp returns, it must have failed.
        perror("execlp failed");
        exit(EXIT_FAILURE); // Exit child with failure status
    } else {
        // This block is executed by the PARENT process.
        // fork() returns the PID of the child to the parent.
        printf("PARENT: I am the parent process, my PID is %d\n", getpid());
        printf("PARENT: I created a child with PID %d\n", child_pid);
        printf("PARENT: I will now wait for my child to finish.\n");

        // Wait for the child process to terminate.
        // The 'status' variable will hold the child's exit status.
        if (waitpid(child_pid, &status, 0) == -1) {
            perror("waitpid failed");
            exit(EXIT_FAILURE);
        }

        printf("\nPARENT: Child process has finished.\n");

        // Check how the child terminated.
        if (WIFEXITED(status)) {
            printf("PARENT: Child exited with status code %d.\n", WEXITSTATUS(status));
        } else if (WIFSIGNALED(status)) {
            printf("PARENT: Child was terminated by signal %d.\n", WTERMSIG(status));
        }
    }

    // Only the parent reaches this point: the child either became 'ls' or exited.
    printf("PARENT: All done. Exiting.\n");
    return EXIT_SUCCESS;
}
Build and Run Steps:
1. Save the code to process_creator.c.
2. Compile the program using GCC. The -o flag specifies the output executable name.
gcc process_creator.c -o process_creator
3. Run the executable.
./process_creator
Expected Output:
The order of the parent and child printf statements may vary due to scheduling, but the overall flow will be consistent.
PARENT: I am the parent process, my PID is 2345
PARENT: I created a child with PID 2346
PARENT: I will now wait for my child to finish.
CHILD: I am the child process, my PID is 2346
CHILD: My parent's PID is 2345
CHILD: Now, I will execute 'ls -l /'...
total 72
drwxr-xr-x 2 root root 4096 Apr 18 10:30 bin
drwxr-xr-x 4 root root 1024 Jan 1 1970 boot
... (rest of 'ls -l /' output) ...
PARENT: Child process has finished.
PARENT: Child exited with status code 0.
PARENT: All done. Exiting.
This example demonstrates the core concepts: the parent creates a child, the child transforms itself with exec, and the parent waits for the child to complete its task before exiting.
Demonstrating a Zombie Process
To see a zombie process, we need to create a child that exits while the parent is still alive but not waiting for it. The parent can then inspect the system before it finally reaps the child.
sequenceDiagram
    participant Parent
    participant Child
    participant Kernel
    Parent->>Kernel: fork()
    Kernel-->>Parent: return child_pid
    Kernel-->>Child: return 0
    Child->>Kernel: exit(0)
    activate Kernel
    Note over Child,Kernel: Child terminates execution.<br>Resources released, but process table entry remains.
    Kernel->>Parent: Sends SIGCHLD signal
    deactivate Kernel
    rect rgb(254, 249, 195)
    note right of Child: Child is now a Zombie (Z) state.<br>Waiting for parent to call wait().
    end
    Parent->>Parent: Continues other work (e.g., sleep(20))
    Parent->>Kernel: wait(&status)
    activate Kernel
    Note over Parent,Kernel: Parent reaps child.
    Kernel-->>Parent: return child_pid, status
    Note right of Child: Zombie is removed from process table.
    deactivate Kernel
Code Snippet: zombie_maker.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void) {
    pid_t child_pid = fork();

    if (child_pid > 0) {
        // PARENT process
        printf("PARENT: Child created with PID %d.\n", child_pid);
        printf("PARENT: I'm going to sleep for 20 seconds.\n");
        printf("PARENT: Check for the zombie process now using 'ps aux | grep zombie_maker'\n");

        // Sleep, giving the child time to exit and become a zombie.
        sleep(20);

        // Now, reap the child process.
        printf("PARENT: Waking up and reaping the zombie.\n");
        wait(NULL);
        printf("PARENT: Zombie reaped. Check again, it should be gone.\n");
        sleep(5);
    } else if (child_pid == 0) {
        // CHILD process
        printf("CHILD: I am the child, and I am exiting now.\n");
        exit(0);
    } else {
        // fork failed
        perror("fork");
        exit(1);
    }

    return 0;
}
Build and Run Steps:
1. Compile the code:
gcc zombie_maker.c -o zombie_maker
2. Run it:
./zombie_maker
3. Quickly, in another terminal, run the ps command while the parent is sleeping:
ps aux | grep zombie_maker
Expected Output (from the ps command):
pi 3456 0.0 0.0 4500 800 pts/0 S+ 06:30 0:00 ./zombie_maker
pi 3457 0.0 0.0 0 0 pts/0 Z+ 06:30 0:00 [zombie_maker] <defunct>
You will see two entries. The first is the parent process in a sleeping state (‘S’). The second is the child process (PID 3457), clearly marked with a ‘Z’ state and labeled as <defunct>. This is the zombie. After 20 seconds, the parent process in the first terminal will wake up, call wait(), and the zombie process will disappear if you run the ps command again.
Common Mistakes & Troubleshooting
Developing programs that manage processes introduces a new class of potential bugs. Understanding these common pitfalls can save hours of debugging time.
Exercises
- Process Tree Explorer:
  - Objective: Use the pstree command to visualize the process hierarchy on your Raspberry Pi 5.
  - Steps:
    - Install the tool if it’s not already present: sudo apt update && sudo apt install psmisc.
    - Run pstree.
    - Run pstree -p to include PIDs in the output.
    - Open two separate terminal windows. In one, run a long-running command like ping google.com. In the other, run pstree -p | grep ping to find the ping process and identify its parent shell.
  - Verification: Confirm that the PPID of the ping process shown by ps -ef matches the PID of the bash process shown in the pstree output.
- Simple Shell in C:
  - Objective: Write a basic C program that mimics a shell. It should display a prompt, read a line of user input, and then use fork() and execvp() to execute the command entered by the user.
  - Guidance:
    - Use a loop to continuously display a prompt (e.g., "> ") and read input. fgets() is a good choice for reading a line.
    - You will need to parse the input string to separate the command from its arguments. strtok() can be useful here.
    - Inside the loop, call fork().
    - The child process should use execvp(), which is convenient as it takes an array of strings (the command and its arguments) and searches the PATH.
    - The parent process should call waitpid() to wait for the child to finish before looping back to the prompt.
    - Add a way to exit, e.g., by typing “exit”.
  - Verification: Your program should be able to run simple commands like ls -l, whoami, and pwd.
- Orphan Process Demonstration:
  - Objective: Write a C program to demonstrate what happens when a parent process exits before its child.
  - Steps:
    - Write a program that calls fork().
    - In the parent’s code block, print its PID and a message saying it is exiting, then exit() immediately.
    - In the child’s code block, print its initial PID and PPID. Then, make it sleep() for 5 seconds. After waking up, have it print its PID and PPID again.
  - Verification: Observe the output. The child’s PPID should change from the original parent’s PID to 1, indicating it has been adopted by the init/systemd process. In a desktop session, the new PPID may instead belong to a per-user systemd instance or another process acting as a subreaper.
- Process State Monitoring with a Shell Script:
  - Objective: Write a shell script that monitors a specific process and reports when its state changes.
  - Guidance:
    - The script should take a PID as a command-line argument.
    - Use a while loop that continues as long as the process directory /proc/$PID exists.
    - Inside the loop, extract the State line from /proc/$PID/status using grep or awk.
    - Store the current state in a variable. On each iteration, compare the new state to the previous state. If it has changed, print a message (e.g., “Process $PID changed from Sleeping to Running”).
    - Use sleep 1 inside the loop to avoid overwhelming the CPU.
  - Verification: Run a process like sleep 20 & to get its PID. Run your script with that PID. In another terminal, stop and continue the sleep process using kill -STOP <PID> and kill -CONT <PID> and watch your script report the state changes from ‘S’ to ‘T’ and back.
Summary
- A program is a passive file on disk, while a process is an active instance of a program in execution.
- The kernel identifies every process with a unique Process ID (PID) and tracks its creator via the Parent Process ID (PPID).
- The init process (PID 1) is the ancestor of all user-space processes on a Linux system.
- The fork() system call creates a new child process, and the exec() family of calls replaces the current process’s image with a new program. This fork-exec pattern is fundamental to command execution.
- Processes transition between several states: Runnable (R), Interruptible Sleep (S), Uninterruptible Sleep (D), Stopped (T), and Zombie (Z).
- A zombie process is a terminated child that has not yet been cleaned up (reaped) by its parent using wait() or waitpid(). Failing to reap children is a common bug that leads to resource leaks.
- Command-line tools like ps, pstree, and the /proc virtual filesystem are essential for inspecting and monitoring process status and relationships.
Further Reading
- Linux Kernel Documentation – The Process Abstraction: (Located within the kernel source tree) – The ultimate source for how the kernel implements and manages processes. https://www.kernel.org/doc/html/v4.18/process/4.Coding.html
- proc(5) Manual Page: Run man 5 proc on your Linux system. This is the official documentation for the /proc filesystem, detailing all the information you can extract from it.
- “The Linux Programming Interface” by Michael Kerrisk: Chapters 24-28 provide an exhaustive and authoritative look at process creation, process termination, monitoring of child processes, and program execution.
- “Advanced Programming in the UNIX Environment” by W. Richard Stevens and Stephen A. Rago: A classic text whose chapters on process control are still highly relevant and provide deep insights.
- LWN.net – Process Management Articles: A highly respected online publication that often features in-depth articles on kernel subsystems, including the process scheduler and manager. Search their archives for relevant topics. https://lwn.net
- Raspberry Pi Foundation Documentation: Official guides on using the Raspberry Pi, which provide context for the hardware platform. https://www.raspberrypi.com/documentation/
- systemd Documentation: For systems using systemd as the init process (PID 1), understanding its role in managing services and the overall process tree is crucial.