Chapter 28: C Programming Refresher: Dynamic Memory Allocation in C
Chapter Objectives
By the end of this chapter, you will be able to:
- Understand the fundamental differences between stack and heap memory allocation in a C program.
- Implement dynamic memory allocation and deallocation using malloc(), calloc(), and free().
- Modify the size of existing memory blocks at runtime using realloc().
- Analyze C programs for memory leaks and other heap-related errors using tools like Valgrind.
- Apply dynamic memory management techniques to solve common problems in embedded systems programming.
- Recognize and prevent common pitfalls such as dangling pointers, buffer overflows, and memory fragmentation.
Introduction
In the world of embedded Linux development, particularly on resource-aware platforms like the Raspberry Pi 5, efficient and precise control over system memory is not just a feature—it is a fundamental requirement. While previous chapters may have focused on static memory allocation, where the size and lifetime of variables are determined at compile time, many real-world applications demand a more flexible approach. Imagine an application that needs to process data from a network socket, read a file of unknown size, or manage a data structure that grows and shrinks based on sensor input. In these scenarios, we cannot know the exact memory requirements beforehand. This is where dynamic memory allocation becomes indispensable.
This chapter serves as a crucial refresher on the principles of managing memory from the heap in the C programming language. The heap is a segment of memory available to a program at runtime, from which we can request and release blocks of memory as needed. This capability grants our programs immense flexibility, allowing them to adapt to changing data loads and operate efficiently without wasting precious RAM. However, this power comes with significant responsibility. Unlike stack memory, which is automatically managed by the compiler, heap memory must be managed manually by the programmer. Failure to do so correctly can lead to critical bugs such as memory leaks, where unused memory is never returned to the system, eventually causing the application or the entire system to fail.
We will explore the standard library functions that serve as our tools for this task: malloc(), calloc(), realloc(), and free(). We will move beyond simple definitions to understand how they work under the hood, their performance implications, and the common patterns and pitfalls associated with their use. By the end of this chapter, you will not only be able to write C code that uses dynamic memory but also understand how to do so safely and efficiently, a skill that is paramount for building robust and reliable embedded systems on your Raspberry Pi 5.
Technical Background
To truly master dynamic memory allocation, one must first understand how a typical C program organizes memory. When your compiled application is loaded by the Linux kernel to run, it is granted a virtual address space. This space is partitioned into several distinct segments, each with a specific purpose. The most relevant to our discussion are the text segment (for the executable code), the data segment (for initialized global and static variables), the BSS segment (for uninitialized global and static variables), the stack, and the heap.
The stack is a region of memory that operates in a Last-In, First-Out (LIFO) manner. It is used for storing local variables, function parameters, and return addresses. Every time a function is called, a new “stack frame” is pushed onto the stack to hold its local variables. When the function returns, its stack frame is popped off. This process is automatic, fast, and managed entirely by the compiler. The size of the stack is generally fixed when the program starts. This rigid, compile-time nature makes it unsuitable for data whose size is unknown until runtime.
This limitation brings us to the heap. The heap is a large, unstructured pool of memory that is managed by the programmer, not the compiler. It is the region from which we dynamically request memory when our program is running. Unlike the stack’s orderly LIFO structure, the heap is more like an open field where we can request plots of land (memory blocks) of varying sizes. This allocation is not free; it is computationally more expensive than stack allocation because it involves searching for a suitable block of memory. Most importantly, the programmer is responsible for explicitly returning this memory to the system when it is no longer needed. This manual process is the primary source of both the power and the peril of dynamic memory management.
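The contrast between the two regions can be sketched in a few lines. The function name make_squares is hypothetical and exists only for illustration: its local variable lives on the stack and disappears when the function returns, while the array it builds lives on the heap until the caller releases it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns a heap array holding 0^2 .. (n-1)^2.
 * The block outlives this function, so the caller must free() it. */
int *make_squares(size_t n)
{
    assert(n > 0);                    /* a zero-length request is a caller bug here */

    int scratch = 0;                  /* stack: released automatically on return */
    int *squares = malloc(n * sizeof *squares);  /* heap: lives until free() */
    if (squares == NULL) {
        return NULL;                  /* allocation failed; nothing to clean up */
    }

    for (size_t i = 0; i < n; i++) {
        scratch = (int)(i * i);
        squares[i] = scratch;
    }
    return squares;                   /* returning &scratch instead would be a bug */
}
```

A caller would write int *sq = make_squares(4);, check the result for NULL, use sq[0] through sq[3], and finish with free(sq);.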
The Core Allocation Functions: malloc and calloc
The primary tool for requesting memory from the heap is the malloc() function, which is declared in the <stdlib.h> header. Its name is an abbreviation for “memory allocation.” The function prototype is void* malloc(size_t size);. It takes a single argument: the number of bytes of memory you wish to allocate. If the request is successful, malloc() returns a void* pointer to the first byte of the allocated block. A void* is a generic pointer that converts implicitly to any other object pointer type in C, so an explicit cast is optional. If the system cannot fulfill the request, for example because the heap is out of memory, malloc() returns NULL.
It is absolutely critical to always check the return value of malloc() for NULL. Dereferencing a NULL pointer is undefined behavior; on Linux it typically ends in a segmentation fault that crashes your program. The memory block returned by malloc() is not initialized; it contains whatever garbage data was previously in that memory location.
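As a minimal sketch of these two rules (check for NULL, and never read bytes you have not written), consider the helper below. The name copy_bytes and its signature are assumptions made for this illustration, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `len` bytes from `src` into a fresh heap block. */
unsigned char *copy_bytes(const unsigned char *src, size_t len)
{
    assert(src != NULL && len > 0);

    unsigned char *dst = malloc(len);  /* contents are indeterminate at this point */
    if (dst == NULL) {
        return NULL;                   /* report failure instead of dereferencing NULL */
    }
    memcpy(dst, src, len);             /* every byte is written before anyone reads it */
    return dst;                        /* caller owns the block and must free() it */
}
```

The function passes the failure up to the caller rather than exiting, which leaves the decision about how to recover with the code that has the most context.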
A close relative of malloc() is calloc(), whose name is commonly expanded as “contiguous allocation.” Its prototype is void* calloc(size_t num, size_t size);. It takes two arguments: the number of elements to allocate and the size of each element in bytes. The total allocated size is num * size bytes. The key difference from malloc() is that calloc() initializes the allocated memory to zero. This can be a useful security feature, preventing the accidental use of sensitive data left over from previous operations, and it can simplify logic by ensuring a known initial state. This initialization can come with a slight performance cost compared to malloc(), but in many embedded contexts, the safety and predictability it offers are well worth it. Like malloc(), calloc() returns a void* pointer on success and NULL on failure.
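The zero-initialization guarantee is easy to see with an array of counters. The sketch below assumes a hypothetical helper named make_histogram; because calloc() zero-fills the block, no explicit clearing loop is needed before the counters are incremented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `bins` counters, all starting at zero. */
unsigned int *make_histogram(size_t bins)
{
    assert(bins > 0);

    /* calloc takes the element count and element size separately
     * and returns a block whose bytes are all zero (or NULL on failure). */
    return calloc(bins, sizeof(unsigned int));
}
```

A caller checks the result for NULL, increments entries such as h[3]++ directly, and calls free(h) when done.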
Releasing Memory: The free Function
Every block of memory allocated with malloc() or calloc() must eventually be returned to the heap for reuse. This is accomplished using the free() function. Its prototype is void free(void* ptr);. It takes a single argument: the pointer that was returned by the allocation function. Calling free() on a pointer tells the memory manager that the block of memory this pointer points to is no longer in use.
Failing to call free() for every malloc() or calloc() results in a memory leak. The leaked memory remains marked as “in use” for the lifetime of the program, even though it is inaccessible. In a long-running embedded application, such as a server or a monitoring device, even a small, repeated memory leak can accumulate over time, eventually consuming all available RAM and causing the system to fail.
Conversely, a more immediate and often catastrophic error is the dangling pointer. This occurs when you call free() on a pointer but then attempt to use that pointer again. The pointer still holds the address of the now-deallocated memory, but that memory region could have been re-allocated for another purpose. Writing to a dangling pointer can corrupt unrelated data structures, while reading from it can yield unpredictable results. A common best practice is to set a pointer to NULL immediately after freeing it, which prevents its accidental reuse.
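One way to make the free-then-NULL habit hard to forget is to wrap both steps in a small helper. The function release_ints below is a hypothetical name used for illustration; it takes the address of the caller's pointer so that it can clear it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: free the block and clear the caller's pointer. */
void release_ints(int **pp)
{
    assert(pp != NULL);

    free(*pp);     /* free(NULL) is defined to do nothing, so repeat calls are safe */
    *pp = NULL;    /* the caller's pointer no longer refers to the freed block */
}
```

A caller writes release_ints(&data); instead of free(data);. Note that this only clears the one pointer passed in: any other copies of the same address elsewhere in the program are still dangling and must be handled separately.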
graph TD
    subgraph Heap State Over Time
        direction TB
        A[Start: Empty Heap] --> B{"Call: malloc(16 bytes)"};
        B --> C["Heap: <br>| <b>Block A (16B)</b> | Free Space |"];
        C --> D{"Call: malloc(24 bytes)"};
        D --> E["Heap: <br>| <b>Block A (16B)</b> | <b>Block B (24B)</b> | Free... |"];
        E --> F{"Call: free(Block A)"};
        F --> G["Heap: <br>| <i>Hole (16B)</i> | <b>Block B (24B)</b> | Free... |"];
        G --> H{"Call: malloc(8 bytes)"};
        H --> I["Heap: <br>| <b>Block C (8B)</b> | <i>Hole (8B)</i> | <b>Block B (24B)</b> | Free... |"];
        I --> J[End State];
    end

    %% Styling
    classDef startNode fill:#1e3a8a,stroke:#1e3a8a,stroke-width:2px,color:#ffffff
    classDef processNode fill:#0d9488,stroke:#0d9488,stroke-width:1px,color:#ffffff
    classDef stateNode fill:#f8fafc,stroke:#64748b,stroke-width:1px,color:#1f2937
    classDef endNode fill:#10b981,stroke:#10b981,stroke-width:2px,color:#ffffff
    class A,J startNode;
    class B,D,F,H processNode;
    class C,E,G,I stateNode;
Resizing Allocations: The realloc Function
Often, the initial memory estimate for a task turns out to be insufficient or excessive. For example, you might allocate a buffer to read data from a file, only to find the file is larger than expected. Instead of allocating a new, larger block, copying the old data, and freeing the old block, you can use realloc(). The realloc() function attempts to resize an existing memory allocation. Its prototype is void* realloc(void* ptr, size_t new_size);.
It takes two arguments: the pointer to the existing memory block and the desired new size in bytes. The behavior of realloc() is nuanced:
- Shrinking: If the new_size is smaller than the original size, the block is truncated. The contents of the block up to the new size are preserved.
- Growing (In-Place): If the new_size is larger, realloc() first checks if there is enough free space immediately following the current block. If so, it expands the block in-place and returns the same pointer that was passed in.
- Growing (Moving): If there is not enough contiguous space, realloc() will find a new, larger block of memory elsewhere on the heap, copy the contents from the old block to the new one, free the old block, and return a pointer to the new block.
- Failure: If realloc() cannot find a large enough block anywhere, it returns NULL, and the original memory block is left untouched.
This last point is critical. A common mistake is to assign the result of realloc() directly back to the original pointer, like this: ptr = realloc(ptr, new_size);. If realloc() fails, it returns NULL, overwriting ptr. Now you have not only failed to get the new memory, but you have also lost your only pointer to the original data, creating a memory leak. The correct pattern is to use a temporary pointer:
void* temp_ptr = realloc(ptr, new_size);
if (temp_ptr == NULL) {
    // Handle reallocation failure, ptr is still valid
    free(ptr); // Or try to continue with the old block
} else {
    ptr = temp_ptr; // Success, update the original pointer
}
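The same pattern can be packaged as a reusable function. The sketch below is one possible shape, with a hypothetical name (resize_ints) and a return-code convention chosen for this illustration; it also guards the size multiplication against overflow, which the fragment above leaves to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resize *buf so it can hold new_count ints.
 * Returns 0 on success, -1 on failure (in which case *buf is unchanged). */
int resize_ints(int **buf, size_t new_count)
{
    assert(buf != NULL && new_count > 0);

    if (new_count > SIZE_MAX / sizeof(int)) {
        return -1;                     /* byte count would overflow size_t */
    }

    int *tmp = realloc(*buf, new_count * sizeof(int));
    if (tmp == NULL) {
        return -1;                     /* old block is still valid and still owned by the caller */
    }

    *buf = tmp;                        /* commit the new address only after success */
    return 0;
}
```

Because realloc() behaves like malloc() when passed a NULL pointer, the caller can start with int *buf = NULL; and use resize_ints(&buf, n) for the first allocation as well as for later growth. On failure the caller still holds the old block and decides whether to free it or carry on.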
The Shadowy World of Fragmentation
The dynamic nature of the heap, with its continuous cycle of allocation and deallocation of variably sized blocks, leads to a problem known as fragmentation. There are two types.
External fragmentation occurs when the available free memory is broken into many small, non-contiguous blocks. You might have enough total free memory to satisfy a large malloc()
request, but no single block is large enough. The memory manager’s job is to minimize this, often by coalescing adjacent free blocks into larger ones, but it’s an inherent challenge of heap management.
Internal fragmentation occurs within allocated blocks. Memory allocators often work with chunks of a certain minimum size or alignment for efficiency. If you request 3 bytes, the allocator might give you a 16-byte chunk. The extra 13 bytes are “internal” to your allocation but are wasted space. This is a trade-off between memory utilization and the speed of the allocation algorithm.
In an embedded system, where memory is a finite and often scarce resource, understanding fragmentation is vital. A poorly designed memory allocation strategy can lead to premature “out of memory” errors, even when technically there should be enough RAM available. This is why predictable memory usage patterns are often favored in critical embedded code.
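Internal fragmentation can be observed directly on glibc-based systems such as Raspberry Pi OS. The sketch below relies on malloc_usable_size(), a glibc extension declared in <malloc.h> that is intended for diagnostics and is not part of standard C; the helper name allocation_slack is hypothetical. The exact numbers depend on the allocator, so treat the result as an observation, not a guarantee.

```c
#include <assert.h>
#include <malloc.h>   /* glibc-specific: declares malloc_usable_size() */
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: how many bytes beyond the request did the allocator reserve? */
size_t allocation_slack(size_t requested)
{
    assert(requested > 0);

    void *p = malloc(requested);
    if (p == NULL) {
        return 0;                      /* treat failure as "no slack" for this demo */
    }

    size_t usable = malloc_usable_size(p);  /* at least `requested` on success */
    free(p);
    return usable - requested;         /* bytes reserved but never asked for */
}
```

Printing allocation_slack(3) with the %zu format shows how much of a tiny allocation is padding on your system. The program must not rely on those extra bytes being available.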
Practical Examples
Theory is essential, but proficiency comes from practice. In this section, we will walk through several hands-on examples that demonstrate dynamic memory allocation on your Raspberry Pi 5. We will use the standard GCC compiler and the Valgrind tool for debugging.
Tip: Before proceeding, ensure you have the necessary build tools installed on Raspberry Pi OS. Open a terminal and run:
sudo apt-get update && sudo apt-get install build-essential valgrind
Example 1: Basic Allocation with malloc and free
Our first example is a simple program that asks the user how many integers they want to store, allocates the necessary memory, fills it with values, prints them, and then cleans up.
File: simple_malloc.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int n;
    int *arr = NULL; // Always initialize pointers to NULL

    // 1. Get the number of elements from the user
    printf("Enter the number of integers to store: ");

    // Defensive check: reject non-numeric input as well as non-positive counts
    if (scanf("%d", &n) != 1 || n <= 0) {
        printf("Invalid number of elements.\n");
        return 1;
    }

    // 2. Allocate memory from the heap
    // We need space for 'n' integers. sizeof(int) ensures portability.
    printf("Allocating memory for %d integers...\n", n);
    arr = (int*)malloc(n * sizeof(int));

    // 3. CRITICAL: Check if malloc was successful
    if (arr == NULL) {
        fprintf(stderr, "Error: Memory allocation failed!\n");
        return 1; // Exit with an error code
    }

    // 4. Use the allocated memory
    printf("Memory allocated successfully. Populating array.\n");
    for (int i = 0; i < n; i++) {
        arr[i] = i * 10; // Fill with some data
    }

    // 5. Print the data to verify
    printf("Array contents:\n");
    for (int i = 0; i < n; i++) {
        printf("arr[%d] = %d\n", i, arr[i]);
    }

    // 6. Release the memory back to the heap
    printf("Freeing allocated memory...\n");
    free(arr);
    arr = NULL; // Good practice: prevent dangling pointer

    printf("Program finished successfully.\n");
    return 0;
}
Build and Run Steps:
- Save the code above into a file named simple_malloc.c.
- Open a terminal on your Raspberry Pi 5.
- Compile the program using GCC:
gcc -Wall -Wextra -g -o simple_malloc simple_malloc.c
  - -Wall -Wextra: Enables the most commonly useful compiler warnings. Essential for catching potential bugs.
  - -g: Includes debugging information, which is useful for tools like GDB and Valgrind.
  - -o simple_malloc: Specifies the name of the output executable file.
- Run the executable:
./simple_malloc
Expected Output:
Enter the number of integers to store: 5
Allocating memory for 5 integers...
Memory allocated successfully. Populating array.
Array contents:
arr[0] = 0
arr[1] = 10
arr[2] = 20
arr[3] = 30
arr[4] = 40
Freeing allocated memory...
Program finished successfully.
Example 2: Using calloc and Detecting Leaks with Valgrind
Now, let’s modify the previous example to use calloc and intentionally introduce a memory leak to see how Valgrind can help us find it.
File: leak_detector.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void create_leak(void) {
    char *leaky_string;

    printf("Inside create_leak function.\n");

    // Allocate memory for a string using calloc
    // calloc will initialize this memory to all zeros.
    leaky_string = (char*)calloc(50, sizeof(char));
    if (leaky_string == NULL) {
        fprintf(stderr, "calloc failed!\n");
        return;
    }

    strcpy(leaky_string, "This memory will be leaked.");
    printf("String created: '%s'\n", leaky_string);

    // MISTAKE: We forget to call free(leaky_string) before the function returns.
    printf("Exiting create_leak without freeing memory.\n");
}

int main(void) {
    printf("Starting main program.\n");
    create_leak();
    printf("Main program finished. A leak has occurred.\n");
    return 0;
}
Build and Analysis Steps:
- Save the code as leak_detector.c.
- Compile it with debugging symbols:
gcc -g -o leak_detector leak_detector.c
- Run the program under Valgrind’s memcheck tool:
valgrind --leak-check=full ./leak_detector
Valgrind Output Explanation:
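Valgrind will run your program and produce a detailed report. The output will be verbose, but the most important part is the LEAK SUMMARY. The report should resemble the following; the process ID (12345 here), the addresses, and the source line numbers depend on your system and on how your copy of the file is laid out.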
==12345== HEAP SUMMARY:
==12345== in use at exit: 50 bytes in 1 blocks
==12345== total heap usage: 2 allocs, 1 frees, 1,074 bytes allocated
==12345==
==12345== 50 bytes in 1 blocks are definitely lost in loss record 1 of 1
==12345== at 0x483DD99: calloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
==12345== by 0x1096C: create_leak (leak_detector.c:12)
==12345== by 0x109B8: main (leak_detector.c:23)
==12345==
==12345== LEAK SUMMARY:
==12345== definitely lost: 50 bytes in 1 blocks
==12345== indirectly lost: 0 bytes in 0 blocks
==12345== possibly lost: 0 bytes in 0 blocks
==12345== still reachable: 0 bytes in 0 blocks
==12345== suppressed: 0 bytes in 0 blocks
Valgrind tells us exactly what happened:
- 50 bytes in 1 blocks are definitely lost. This is our leak.
- It points to the exact line of code where the allocation occurred: leak_detector.c:12, inside the create_leak function.
To fix this, simply add free(leaky_string); before the function returns. Re-compile and run with Valgrind again, and the heap summary will report in use at exit: 0 bytes in 0 blocks, followed by the message “All heap blocks were freed -- no leaks are possible”.
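For reference, a corrected variant might look like the sketch below. The name create_without_leak, the changed message text, and the choice to return the string length are assumptions made so the function can be checked from a caller; the essential change is the single free() call before the function returns.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical corrected variant of create_leak(): returns the message
 * length so a caller can see it ran, and releases the buffer before returning. */
size_t create_without_leak(void)
{
    char *message = calloc(50, sizeof(char));
    if (message == NULL) {
        fprintf(stderr, "calloc failed!\n");
        return 0;                      /* nothing was allocated, nothing to free */
    }

    strcpy(message, "This memory will be freed.");  /* 26 chars + '\0' fits in 50 */
    size_t length = strlen(message);
    assert(length < 50);

    free(message);                     /* the call that was missing */
    return length;
}
```

The early return on allocation failure does not need a free() because no block exists on that path; every path that does own a block releases it.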
Example 3: Dynamic Array Resizing with realloc
This example simulates reading an unknown amount of data (e.g., from a sensor). We start with a small buffer and use realloc to grow it as more data “arrives”.
File: resizer.c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *data = NULL;
    size_t capacity = 2; // Start with a small capacity
    size_t count = 0;    // Number of elements currently stored

    // 1. Initial allocation
    data = (int*)malloc(capacity * sizeof(int));
    if (data == NULL) {
        fprintf(stderr, "Initial allocation failed!\n");
        return 1;
    }
    printf("Initial allocation: capacity = %zu\n", capacity);

    // 2. Simulate reading data in a loop
    for (int i = 0; i < 10; i++) {
        // Check if we need to grow the array
        if (count == capacity) {
            size_t new_capacity = capacity * 2; // Double the capacity
            printf("Capacity reached. Resizing from %zu to %zu...\n", capacity, new_capacity);

            // Use a temporary pointer for the realloc call
            int *temp_data = (int*)realloc(data, new_capacity * sizeof(int));

            // Check if realloc succeeded
            if (temp_data == NULL) {
                fprintf(stderr, "Error: Memory reallocation failed!\n");
                free(data); // Free the original block
                return 1;
            }

            // If successful, update the main pointer and capacity
            data = temp_data;
            capacity = new_capacity;
        }

        // Add the new "sensor reading"
        data[count] = 100 + i;
        count++;
        printf("Added element %d. Current count: %zu\n", data[count-1], count);
    }

    // 3. Print final results
    printf("\n--- Final Data ---\n");
    printf("Total elements: %zu, Final capacity: %zu\n", count, capacity);
    for (size_t i = 0; i < count; i++) {
        printf("data[%zu] = %d\n", i, data[i]);
    }

    // 4. Clean up
    free(data);
    data = NULL;
    return 0;
}
Build and Run:
gcc -Wall -g -o resizer resizer.c
./resizer
Expected Output:
Initial allocation: capacity = 2
Added element 100. Current count: 1
Added element 101. Current count: 2
Capacity reached. Resizing from 2 to 4...
Added element 102. Current count: 3
Added element 103. Current count: 4
Capacity reached. Resizing from 4 to 8...
Added element 104. Current count: 5
Added element 105. Current count: 6
Added element 106. Current count: 7
Added element 107. Current count: 8
Capacity reached. Resizing from 8 to 16...
Added element 108. Current count: 9
Added element 109. Current count: 10
--- Final Data ---
Total elements: 10, Final capacity: 16
data[0] = 100
data[1] = 101
...
data[9] = 109
This example illustrates the “safe realloc” pattern and demonstrates how to build a data structure that can adapt to an unknown workload, a common requirement in embedded systems.
Common Mistakes & Troubleshooting
The manual nature of heap management in C is a fertile ground for bugs. Understanding the common pitfalls is the first step toward avoiding them.
Exercises
These exercises are designed to reinforce the concepts of this chapter. Attempt to solve them on your Raspberry Pi 5.
- Dynamic String Concatenation:
  - Objective: Write a function char* concatenate(const char* s1, const char* s2);.
  - Requirements: This function should take two C strings as input. It must dynamically allocate enough memory on the heap to hold the concatenated result (including the null terminator). It should then copy the contents of s1 followed by s2 into this new buffer and return a pointer to it. The calling function is responsible for freeing this memory.
  - Verification: Write a main function that calls concatenate, prints the result, and then uses Valgrind to ensure there are no memory leaks.
- Reading a File into a Dynamic Buffer:
  - Objective: Write a program that reads the entire contents of a text file into a single dynamically allocated string.
  - Requirements: The program should take a filename as a command-line argument. You cannot assume the file’s size beforehand. Start by allocating a small buffer. Read a chunk of the file into the buffer. If you reach the end of the buffer but not the end of the file, use realloc to grow your buffer (doubling its size is a good strategy), and continue reading.
  - Verification: Create a test text file. Run your program and have it print the string it read. Check with Valgrind to ensure all memory is freed correctly, especially in error cases (e.g., file not found).
- Implement a Dynamic Stack Data Structure:
  - Objective: Create a simple stack implementation for integers using a dynamically sized array.
  - Requirements:
    - Create a struct Stack containing a pointer to the data (int *items), the current number of items (top), and the allocated capacity (capacity).
    - Implement create_stack(capacity) to malloc the struct and its initial data array.
    - Implement push(stack, item) which adds an item. If the stack is full, it should use realloc to double the capacity.
    - Implement pop(stack) to remove and return the top item.
    - Implement destroy_stack(stack) to free the data array and the stack struct itself.
  - Verification: Write a main function that creates a stack, pushes more items onto it than its initial capacity to trigger a resize, pops them all off, and then destroys the stack. Verify with Valgrind.
Summary
This chapter provided a vital refresher on dynamic memory management in C, a critical skill for any embedded Linux developer. We have moved from theory to practice, establishing a solid foundation for writing flexible and robust applications.
- Stack vs. Heap: We contrasted the fast, automatic, compile-time nature of stack memory with the flexible, manual, run-time nature of the heap.
- Core Functions: We detailed the use of malloc() for raw allocation, calloc() for zero-initialized allocation, realloc() for resizing existing blocks, and the all-important free() for returning memory to the system.
- Error Handling: The importance of checking for NULL return values from allocation functions was stressed, as was the “safe realloc” pattern using a temporary pointer.
- Common Pitfalls: We identified and provided solutions for the most common memory management bugs: memory leaks, dangling pointers, and buffer overflows.
- Debugging with Valgrind: We demonstrated how to use the Valgrind memcheck tool to automatically detect memory leaks and invalid memory access, proving it to be an indispensable part of the development workflow.
By mastering these concepts, you are now better equipped to write C programs for the Raspberry Pi 5 that can handle data of unknown sizes and adapt to changing conditions, all while maintaining the stability and reliability required of an embedded system.
Further Reading
- The C Standard Library (stdlib.h) Documentation: The definitive source for the behavior of malloc, calloc, realloc, and free. The Linux man pages are an excellent, practical reference. (man malloc)
- Valgrind User Manual: The official documentation for the Valgrind tool suite. It provides in-depth explanations of all its features, including memcheck. (https://valgrind.org/docs/manual/manual.html)
- “Understanding Memory Management in C” by Embedded.com: A well-regarded article that provides a clear overview of the concepts discussed in this chapter from an embedded systems perspective. (https://www.embedded.com/understanding-memory-management-in-c/)
- “C Programming: A Modern Approach” by K.N. King: A highly respected textbook that offers clear, comprehensive explanations of C language features, including an excellent chapter on dynamic memory management.
- Beej’s Guide to C Programming: An accessible and popular online guide that covers all aspects of C programming, with a very clear section on memory allocation. (https://beej.us/guide/bgc/)
- Dietmar Kühl’s Stack Overflow Answer on realloc: A famously detailed and accurate explanation of the correct and safe way to use realloc. (Search for “Correct usage of realloc” on Stack Overflow).