Chapter 54: Advanced File I/O: Memory-Mapped Files (mmap, munmap)
Chapter Objectives
Upon completing this chapter, you will be able to:
- Understand the fundamental concepts of virtual memory and how memory-mapped I/O functions within the Linux kernel.
- Explain the differences between traditional file I/O (read/write) and memory-mapped I/O (mmap), and identify scenarios where mmap is the superior choice.
- Implement C programs on a Raspberry Pi 5 that use mmap to map files into a process’s address space for both reading and writing.
- Configure memory mappings using different protection (PROT_*) and visibility (MAP_*) flags to control access and sharing behavior.
- Debug common issues related to memory-mapped files, such as segmentation faults, bus errors, and synchronization problems.
- Apply memory-mapped techniques to solve practical data-sharing and persistence problems in embedded systems.
Introduction
Performance is paramount in embedded Linux. Systems are often constrained by processing power and memory, yet are required to handle large volumes of data efficiently. Traditional file I/O, which relies on the read() and write() system calls, has served as the bedrock of data handling for decades. This model, however, introduces a layer of overhead. Data must be copied from the kernel’s page cache into a user-space buffer for a read(), and back again for a write(). For large files or performance-critical applications, this constant data movement between kernel and user space can become a significant bottleneck.
This chapter introduces a more elegant and powerful alternative: memory-mapped I/O. Using the mmap() system call, a process can ask the kernel to map a file directly into its virtual address space. Once mapped, the file can be accessed just like an array in memory, using simple pointer arithmetic. The tedious cycle of read(), write(), and lseek() is replaced by direct memory access. This is not merely a convenience; it is a fundamental shift in how we interact with data. The kernel handles the loading of file pages into physical memory on demand, a process known as demand paging. This lazy loading mechanism, combined with the elimination of explicit data copies, can yield dramatic performance improvements.
sequenceDiagram
    actor User as User Application
    participant Kernel
    participant Disk

    %% Traditional Read Path
    rect rgb(240, 240, 240)
        Note over User,Disk: Traditional I/O: read()
        User->>Kernel: read(fd, buffer, count)
        activate Kernel
        Note over Kernel,Disk: Kernel copies data from<br/>Disk to Page Cache
        Kernel->>Disk: Request data block
        Disk->>Kernel: Return data block
        Note over User,Kernel: Double Copy:<br/>1. Disk to Kernel Cache<br/>2. Kernel Cache to User Buffer
        Kernel->>User: Copy data to user buffer
        deactivate Kernel
    end

    %% mmap Path
    rect rgb(230, 245, 255)
        Note over User,Disk: Memory-Mapped I/O: mmap()
        User->>Kernel: mmap(fd, ...)
        activate Kernel
        Note over Kernel: Kernel sets up page table entries<br/>No data is copied yet
        Kernel->>User: Return pointer to mapped region
        deactivate Kernel
        User->>User: Access *ptr
        Note over User,Disk: Page Fault on first access<br/>Kernel handles loading from disk directly<br/>into a shared frame. No extra copy to user space
        activate Kernel
        Kernel->>Disk: Request data block (on-demand)
        Disk->>Kernel: Return data block to page cache
        deactivate Kernel
    end
Real-world applications of mmap are widespread and impactful. High-performance databases use it to manage large data files, dynamic linkers use it to load shared libraries into memory, and scientific applications use it to process massive datasets that would otherwise not fit in physical RAM. In this chapter, you will move beyond the theory and gain hands-on experience. Using your Raspberry Pi 5, you will learn to map files, manipulate their contents directly in memory, and understand the subtle but crucial differences between shared and private mappings. By the end, you will have a powerful new tool in your system programming arsenal, enabling you to build more efficient and sophisticated embedded applications.
Technical Background
To truly appreciate the power and elegance of mmap, one must first have a solid understanding of the virtual memory system that underpins modern operating systems like Linux. Every process running on the system operates within its own private, virtual address space, a conceptual sandbox that isolates it from other processes and the kernel itself. This address space is a contiguous range of memory addresses, typically from zero up to a very large number, that the process can use. However, these virtual addresses are not the same as the physical addresses of the RAM chips in the computer.
The magic of translation is handled by a collaboration between the operating system’s kernel and the CPU’s Memory Management Unit (MMU). The MMU is a piece of hardware that translates virtual addresses generated by the CPU into physical addresses in RAM. The kernel maintains a set of tables, called page tables, for each process. These tables store the mappings between the process’s virtual pages (chunks of virtual memory) and the physical frames (chunks of physical RAM) they correspond to. When a process accesses a memory location, the MMU uses these page tables to find the correct physical location.
This architecture is what makes mmap possible. The mmap() system call is essentially a request to the kernel to create a new mapping in the calling process’s page tables. Instead of mapping a virtual page to an anonymous frame of physical RAM (as is the case for normal program memory allocated by malloc), mmap maps it to a specific portion of a file on disk.
When a process first calls mmap to map a file, the kernel does not immediately read the entire file into memory. It simply sets up the necessary virtual memory structures and updates the process’s page tables to reflect the new mapping. The actual loading of data is deferred until the process attempts to access a memory address within the mapped region. The first time this happens, the MMU will find no valid physical memory mapping for that virtual address and will trigger a hardware exception called a page fault.
This fault is not an error in the traditional sense. It is a signal to the kernel that it needs to intervene. The kernel’s page fault handler inspects the address that caused the fault, determines that it belongs to a memory-mapped region, and identifies which part of the file corresponds to the requested page. It then allocates a physical frame of RAM, reads the relevant data from the file on disk into that frame, and updates the process’s page table to map the virtual page to the newly loaded physical frame. Finally, it resumes the process, which can now access the memory location as if it had been in RAM all along. This entire process is transparent to the application and is known as demand paging.
flowchart TD
    subgraph Process Execution
        A["Start: Process calls mmap()"]
        B{Access memory in<br>mapped region?}
        C[Pointer dereference: `*ptr`]
    end

    subgraph Kernel & MMU Interaction
        D{Page in<br>Physical RAM?}
        E[MMU triggers<br><b>Page Fault</b>]
        F[Kernel Page Fault Handler]
        G[Find VMA for address]
        H[Locate corresponding<br>block in file]
        I[Read file block<br>from disk into a<br>free RAM page]
        J[Update Page Table:<br>Map virtual page to<br>new physical RAM page]
    end

    subgraph Result
        K[Access Granted:<br>Data is returned to process]
        L[Resume Process Execution]
    end

    A --> B;
    B -- No --> B;
    B -- Yes --> C;
    C --> D;
    D -- Yes --> K;
    D -- No --> E;
    E --> F;
    F --> G;
    G --> H;
    H --> I;
    I --> J;
    J --> L;
    K --> L;
    L --> B;

    %% Styling
    classDef primary fill:#1e3a8a,stroke:#1e3a8a,stroke-width:2px,color:#ffffff;
    classDef success fill:#10b981,stroke:#10b981,stroke-width:2px,color:#ffffff;
    classDef decision fill:#f59e0b,stroke:#f59e0b,stroke-width:1px,color:#ffffff;
    classDef process fill:#0d9488,stroke:#0d9488,stroke-width:1px,color:#ffffff;
    classDef check fill:#ef4444,stroke:#ef4444,stroke-width:1px,color:#ffffff;
    classDef kernel fill:#8b5cf6,stroke:#8b5cf6,stroke-width:1px,color:#ffffff;
    class A primary;
    class B,D decision;
    class C,I,J,L process;
    class E check;
    class F,G,H kernel;
    class K success;
The mmap System Call
The mmap system call is the core of this mechanism. Its prototype, found in <sys/mman.h>, is as follows:
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
Let’s break down these arguments in detail.
- void *addr: This argument is a hint to the kernel about where to place the mapping in the virtual address space. In modern practice, you should almost always pass NULL for this argument. This lets the kernel choose a suitable, available address, which is far more portable and reliable than trying to manage the address layout yourself.
- size_t length: This specifies the number of bytes to be mapped, starting from offset. It must be greater than zero, but it does not need to cover the entire file, so you can map just a specific segment of a larger file.
- int prot: This argument controls the memory protection of the mapping and is crucial for security and correctness. It is a bitmask created by OR-ing together several flags:
  - PROT_READ: The pages can be read.
  - PROT_WRITE: The pages can be written. Attempting to write to a mapping without this flag will result in a segmentation fault.
  - PROT_EXEC: The pages can be executed. This is used by dynamic loaders for shared libraries but should be used with extreme caution in general applications, because memory that is both writable and executable makes code-injection attacks (for example, after a buffer overflow) much easier.
  - PROT_NONE: The pages cannot be accessed at all.
- int flags: This argument specifies the type of mapping and other options. The most important choice is between MAP_SHARED and MAP_PRIVATE.
  - MAP_SHARED: This is the key to sharing data. If a process writes to a region mapped with MAP_SHARED, the modification is carried back to the underlying file on disk. Furthermore, this change becomes visible to any other process that has also mapped the same file with MAP_SHARED. This is a highly efficient mechanism for inter-process communication (IPC).
  - MAP_PRIVATE: This creates a copy-on-write (COW) mapping. When the process reads from the mapping, it sees the file’s contents. However, the first time the process attempts to write to a page of the mapping, the kernel intercepts the action. It creates a private copy of that page in RAM, and the process’s page table is updated to point to this new private copy. All subsequent writes to that page go to the private copy. The original file on disk is never changed, and these changes are not visible to any other process. This is useful when you want to work with a file’s data as a starting template without modifying the original.
- int fd: This is the file descriptor of the open file you wish to map. The file must be opened with permissions compatible with the prot and flags arguments. For example, to create a writable MAP_SHARED mapping (PROT_WRITE), the file must have been opened with O_RDWR. O_WRONLY is not sufficient, because a mapping always requires read access to the file.
- off_t offset: This is the starting offset in the file from where the mapping should begin. A critical requirement is that this offset must be a multiple of the system’s page size. The page size is a fundamental unit of memory management. On many Linux systems it is 4096 bytes (4 KiB), but this is not universal: the default Raspberry Pi OS kernel for the Raspberry Pi 5 uses 16 KiB pages. Never hard-code the value; retrieve it programmatically using sysconf(_SC_PAGESIZE).
Upon success, mmap returns a pointer to the start of the mapped memory region. On failure, it returns MAP_FAILED, which is a macro for (void *) -1, and errno is set to indicate the error. Note that the failure value is not NULL, so the result must be compared against MAP_FAILED.
Cleaning Up with munmap and msync
A mapping created with mmap persists until the process terminates or until it is explicitly removed with the munmap() system call. It is essential to clean up mappings to release the virtual address space they occupy. The prototype is simple:
int munmap(void *addr, size_t length);
Here, addr is the starting address returned by mmap, and length is the size of the mapping. A common source of bugs is a mismatch between the length provided to mmap and munmap. It’s safest to always unmap the exact same size that was originally mapped.
For MAP_SHARED mappings, modifications made to the memory region are not guaranteed to be written to the underlying file immediately. The kernel may buffer these changes in memory for efficiency. To explicitly control when data is written to disk, you can use the msync() system call:
int msync(void *addr, size_t length, int flags);
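The flags argument controls the synchronization behavior:
- MS_SYNC: Requests the write-back and blocks until it has completed.
- MS_ASYNC: Schedules the write-back and returns immediately.
- MS_INVALIDATE: Asks the kernel to invalidate other mappings of the same file so that they pick up the freshly written data.
MS_SYNC and MS_ASYNC are mutually exclusive; specifying both in the same call is an error.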
Using msync is critical in applications where data integrity is paramount, such as a database system. After a successful msync call with MS_SYNC, the modified pages have been written back to the underlying storage, which protects them against a subsequent system crash or power failure.
Practical Examples
The following examples are designed to be compiled and run on a Raspberry Pi 5 running Raspberry Pi OS or a similar Linux distribution. You will need the standard C development tools (gcc, make).
Example 1: Basic File Editing with mmap
This first example demonstrates the fundamental use of mmap to read from and write to a file. We will create a text file, map it into memory, modify its contents via a pointer, and then verify that the underlying file has changed. This showcases the power of MAP_SHARED.
Build and Configuration Steps
1. Create a test file. On your Raspberry Pi’s terminal, create a simple text file that we will manipulate.
echo "Hello Embedded Linux World!" > mmap_test.txt
2. Create the C source file. Using a text editor like nano or vim, create a file named mmap_editor.c.
nano mmap_editor.c
Code Snippet
Copy and paste the following C code into mmap_editor.c. The comments explain each step of the process.
// mmap_editor.c
// A simple program to demonstrate file editing using mmap.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>    // For open(), O_RDWR
#include <unistd.h>   // For close()
#include <sys/mman.h> // For mmap(), msync(), munmap()
#include <sys/stat.h> // For fstat()

int main(int argc, char *argv[]) {
    if (argc != 2) {
        fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    const char *filepath = argv[1];
    const char *new_text = "Greetings from Raspberry Pi 5!";

    // 1. Open the file for reading and writing.
    int fd = open(filepath, O_RDWR);
    if (fd == -1) {
        perror("Error opening file");
        exit(EXIT_FAILURE);
    }

    // 2. Get file statistics to determine its size.
    struct stat file_stat;
    if (fstat(fd, &file_stat) == -1) {
        perror("Error getting file size");
        close(fd);
        exit(EXIT_FAILURE);
    }
    off_t file_size = file_stat.st_size;
    // off_t is not guaranteed to be a long, so cast before printing.
    printf("Original file size: %lld bytes\n", (long long)file_size);

    // 3. Map the file into memory.
    // - addr = NULL: Let the kernel choose the address.
    // - length = file_size: Map the entire file (fails if the file is empty).
    // - prot = PROT_READ | PROT_WRITE: We want to read and write.
    // - flags = MAP_SHARED: Changes should be written back to the file.
    // - fd: The file descriptor of our open file.
    // - offset = 0: Start the mapping from the beginning of the file.
    char *mapped_region = mmap(NULL, file_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mapped_region == MAP_FAILED) {
        perror("Error mapping file");
        close(fd);
        exit(EXIT_FAILURE);
    }

    // The file descriptor is no longer needed after mmap, so we can close it.
    // The mapping will remain active.
    close(fd);

    // 4. Interact with the file as if it were a character array in memory.
    // The mapped bytes are not NUL-terminated, so bound the print explicitly.
    printf("Original file content: %.*s\n", (int)file_size, mapped_region);

    // Overwrite the beginning of the file with our new text.
    // We must be careful not to write past the end of the mapped region (file_size).
    size_t new_text_len = strlen(new_text);
    if (new_text_len > (size_t)file_size) {
        fprintf(stderr, "New text is larger than the file. Truncating.\n");
        new_text_len = (size_t)file_size;
    }
    // Use memcpy with the bounded length to copy the data.
    memcpy(mapped_region, new_text, new_text_len);
    printf("Modified the mapped memory.\n");

    // 5. Synchronize the changes back to the disk.
    // This ensures the data is persistently stored.
    if (msync(mapped_region, file_size, MS_SYNC) == -1) {
        perror("Error syncing file to disk");
    } else {
        printf("msync complete. Changes should be on disk.\n");
    }

    // 6. Unmap the memory region.
    // This is crucial to release the resources.
    if (munmap(mapped_region, file_size) == -1) {
        perror("Error unmapping file");
        // Continue to exit, but report the error.
    }

    printf("Program finished successfully.\n");
    return 0;
}
Build, Flash, and Boot Procedures
This example doesn’t involve flashing a device, as we are running it directly on the Raspberry Pi’s OS.
1. Compile the code. Use gcc to compile the program. The -o flag specifies the name of the output executable.
gcc mmap_editor.c -o mmap_editor
2. Run the program. Execute the compiled program, passing the mmap_test.txt file as an argument.
./mmap_editor mmap_test.txt
Expected Output
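The program will print messages indicating its progress. The test file is 28 bytes long (27 characters plus the newline added by echo), which is shorter than the 30-character replacement text, so the program warns that it is truncating the write. The blank line appears because the file’s own trailing newline is printed as part of its content:
Original file size: 28 bytes
Original file content: Hello Embedded Linux World!

New text is larger than the file. Truncating.
Modified the mapped memory.
msync complete. Changes should be on disk.
Program finished successfully.
Now, check the contents of the file to verify the change:
cat mmap_test.txt
The output should be the first 28 bytes of the new text, ending in a space. Because the original newline was overwritten, your shell prompt will appear directly after it:
Greetings from Raspberry Pi 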
This confirms that by simply modifying a memory region through a pointer, we have successfully edited the underlying file on the disk, thanks to the MAP_SHARED flag.
Example 2: MAP_PRIVATE vs. MAP_SHARED
This example clearly illustrates the fundamental difference between MAP_PRIVATE and MAP_SHARED. We will map the same file twice, once with each flag, modify both mappings, and observe the effect on the original file.
Build and Configuration Steps
1. Create the test file. This example uses its own file, compare.txt, so the changes made in Example 1 do not interfere.
echo "Original Data for Comparison" > compare.txt
2. Create the C source file. Create a new file named mmap_compare.c.
nano mmap_compare.c
Code Snippet
Copy the following code into mmap_compare.c.
// mmap_compare.c
// Demonstrates the difference between MAP_SHARED and MAP_PRIVATE.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

// Number of bytes to print: the mapped data without a trailing newline, if any.
static int printable_len(const char *data, off_t size) {
    return (size > 0 && data[size - 1] == '\n') ? (int)(size - 1) : (int)size;
}

void map_and_modify(const char *filepath, int flags, const char *modification) {
    printf("\n--- Testing with %s ---\n", (flags == MAP_SHARED) ? "MAP_SHARED" : "MAP_PRIVATE");

    int fd = open(filepath, O_RDWR);
    if (fd == -1) {
        perror("open");
        return;
    }

    struct stat file_stat;
    if (fstat(fd, &file_stat) == -1) {
        perror("fstat");
        close(fd);
        return;
    }
    off_t file_size = file_stat.st_size;

    char *map = mmap(NULL, file_size, PROT_READ | PROT_WRITE, flags, fd, 0);
    if (map == MAP_FAILED) {
        perror("mmap");
        close(fd);
        return;
    }
    close(fd);

    printf("Original content in mapping: '%.*s'\n", printable_len(map, file_size), map);

    // Modify the memory. Only the start of the mapping is overwritten.
    // memcpy is used rather than strncpy, which would pad the rest of the
    // region with NUL bytes and wipe out the remaining file content.
    size_t mod_len = strlen(modification);
    if (mod_len > (size_t)file_size) {
        mod_len = (size_t)file_size;
    }
    memcpy(map, modification, mod_len);
    printf("Content after modification: '%.*s'\n", printable_len(map, file_size), map);

    // For MAP_SHARED, we sync to ensure the change is written back.
    if (flags == MAP_SHARED) {
        if (msync(map, file_size, MS_SYNC) == -1) {
            perror("msync");
        }
    }

    if (munmap(map, file_size) == -1) {
        perror("munmap");
    }
    printf("--- Test Finished ---\n");
}

int main(void) {
    const char *filename = "compare.txt";
    const char *shared_mod = "SHARED_WRITE";
    const char *private_mod = "PRIVATE_WRITE";

    // Test with MAP_PRIVATE first
    map_and_modify(filename, MAP_PRIVATE, private_mod);

    // Check the file's content after the private mapping test
    printf("\nContent of '%s' after MAP_PRIVATE test:\n", filename);
    fflush(stdout); // make sure our output appears before cat's
    system("cat compare.txt");

    // Now test with MAP_SHARED
    map_and_modify(filename, MAP_SHARED, shared_mod);

    // Check the file's content after the shared mapping test
    printf("\nContent of '%s' after MAP_SHARED test:\n", filename);
    fflush(stdout);
    system("cat compare.txt");

    return 0;
}
Build and Run
1. Compile the code.
gcc mmap_compare.c -o mmap_compare
2. Run the executable.
./mmap_compare
Expected Output
The output will clearly show the different behaviors (blank separator lines omitted):
--- Testing with MAP_PRIVATE ---
Original content in mapping: 'Original Data for Comparison'
Content after modification: 'PRIVATE_WRITE for Comparison'
--- Test Finished ---
Content of 'compare.txt' after MAP_PRIVATE test:
Original Data for Comparison
--- Testing with MAP_SHARED ---
Original content in mapping: 'Original Data for Comparison'
Content after modification: 'SHARED_WRITEa for Comparison'
--- Test Finished ---
Content of 'compare.txt' after MAP_SHARED test:
SHARED_WRITEa for Comparison
As you can see, the modification made to the MAP_PRIVATE mapping was discarded when the mapping was unmapped; the original file remained unchanged. This is the copy-on-write mechanism in action. Conversely, the modification to the MAP_SHARED mapping was successfully propagated back to the file, permanently altering its content.
Common Mistakes & Troubleshooting
While mmap is powerful, its direct memory access nature means that errors can have more severe consequences than with traditional I/O. Here are some common pitfalls and how to avoid them.
Exercises
- Modify the Editor: Take the mmap_editor.c program and modify it to append text to the end of the file instead of overwriting the beginning. This will require you to use ftruncate() to increase the file’s size before you map it. The new size should accommodate the original content plus the new text.
- Implement File Copy with mmap: Write a new program called mmap_copy.c that copies a source file to a destination file. It should take two command-line arguments: source_path and destination_path. The logic should be:
  - Open the source file and map it into memory with PROT_READ.
  - Create or truncate the destination file, open it, and use ftruncate() to set its size to be the same as the source file.
  - Map the destination file into memory with PROT_READ | PROT_WRITE.
  - Use a single memcpy() call to copy the data from the source mapping to the destination mapping.
  - Clean up all mappings and file descriptors.
- Page-Alignment Calculator: Write a small utility that takes a file offset as a command-line argument. The program should print the system’s page size, the original offset, and the calculated page-aligned offset required for mmap. This reinforces the understanding of the page-alignment requirement. Use sysconf(_SC_PAGESIZE) to get the page size.
- Shared Counter: Write two separate programs. The first, mmap_init.c, should create a file, truncate it to the size of a single long int, map it, write the value 0 to it, and then unmap. The second program, mmap_increment.c, should map the same file, read the long int, print it, increment it, write it back, and unmap. Run mmap_init once, and then run mmap_increment multiple times in a row. Observe how the value persists and is shared between separate invocations of the program. This simulates a simple form of persistent, shared state.
- Exploring MAP_PRIVATE: Modify the mmap_compare.c program. After modifying the MAP_PRIVATE mapping, instead of immediately unmapping, add a sleep(30) call. While the program is sleeping, open another terminal and inspect the contents of the compare.txt file using cat. Then, from a third terminal, find the process ID (PID) of your sleeping program and inspect its virtual memory map using cat /proc/<PID>/maps. This will show you the memory mappings for the process. Find the entry for compare.txt and note that its permission field ends in p (private). The page that was copied for your copy-on-write modification is reported as Private_Dirty for that entry in /proc/<PID>/smaps. This provides a deeper, practical look into what the kernel is doing behind the scenes.
Summary
- Memory-mapped I/O is a high-performance alternative to traditional read/write system calls, eliminating memory copies between the kernel and user space.
- The mmap() system call requests the kernel to map a file directly into a process’s virtual address space.
- The kernel uses demand paging to load file data into physical RAM only when it is actually accessed by the program, triggered by a page fault.
- MAP_SHARED creates a mapping where modifications are written back to the underlying file and are visible to other processes that have also mapped the file.
- MAP_PRIVATE creates a copy-on-write (COW) mapping, where modifications are made to a private copy in memory and do not affect the original file.
- The offset argument to mmap() must be a multiple of the system’s page size.
- munmap() is essential for releasing the mapped region and avoiding resource leaks in long-running applications.
- msync() provides explicit control over synchronizing changes in a MAP_SHARED mapping with the persistent storage.
- Common errors include segmentation faults from access violations and bus errors from accessing parts of a file that no longer exist.
Further Reading
- Linux man-pages: The authoritative source. Read them carefully on your system:
  - man 2 mmap
  - man 2 munmap
  - man 2 msync
- The Linux Programming Interface by Michael Kerrisk. Chapter 49 provides an exhaustive and excellent treatment of memory mappings.
- Advanced Programming in the UNIX Environment by W. Richard Stevens and Stephen A. Rago. A classic text with deep insights into UNIX/Linux system calls.
- LWN.net: An excellent source for in-depth articles on kernel internals. Search for articles related to the memory management subsystem and mmap.
- POSIX.1-2017 Standard: The official standard defining mmap and related functions. Available from The Open Group website.
- “Anatomy of a Program in Memory”: A classic article explaining process memory layout, which provides crucial context for understanding virtual address space. (Many versions of this exist online; find a well-regarded one.)
- Raspberry Pi Documentation: While not specific to mmap, the official hardware documentation can provide context for the underlying architecture (e.g., MMU capabilities). https://www.raspberrypi.com/documentation/