Chapter 119: Protocol Buffers on ESP32
Chapter Objectives
By the end of this chapter, you will be able to:
- Understand what Protocol Buffers (Protobuf) are and their advantages for IoT applications.
- Define data structures (messages) using the Protobuf .proto syntax.
- Compile .proto files into C code suitable for microcontrollers using protoc and the nanopb generator.
- Integrate the nanopb library into an ESP-IDF project.
- Serialize C structures into the compact Protobuf binary format on an ESP32.
- Deserialize Protobuf binary data back into C structures on an ESP32.
- Understand the trade-offs (code size, performance, memory usage) of using Protobuf on ESP32.
- Recognize how Protobuf can be used with various communication protocols (MQTT, CoAP, HTTP) for efficient data exchange.
Introduction
In the world of IoT and embedded systems, efficient communication is paramount. Devices often operate with limited processing power, memory, and network bandwidth. While text-based data formats like JSON and XML are human-readable and widely used, they can be verbose and computationally expensive to parse on microcontrollers like the ESP32. This overhead can impact performance, power consumption, and data transmission costs.
graph TB
subgraph Protocol Buffer Exchange
direction TB
B1["<center><b>Device A</b><br>(e.g., ESP32)</center>"]
B2["<center><b>Data</b><br>(Defined in .proto)</center>"]
B3["<center><b>Serialization</b><br>(Protobuf Encoding)</center>"]
B4["<center><b>Compact Payload</b><br>(Binary, Smaller Size)</center>"]
B5["<center><b>Network Transmission</b></center>"]
B6["<center><b>Device B / Server</b></center>"]
B7["<center><b>Deserialization</b><br>(Protobuf Decoding)<br><i>Faster, Less Intensive</i></center>"]
B8["<center><b>Processed Data</b></center>"]
B1 --> B2 --> B3 --> B4 --> B5 --> B6 --> B7 --> B8
end
subgraph Traditional Data Exchange
direction TB
A1["<center><b>Device A</b><br>(e.g., ESP32)</center>"]
A2["<center><b>Data</b><br>(Sensor Readings, etc.)</center>"]
A3["<center><b>Serialization</b><br>(JSON / XML)</center>"]
A4["<center><b>Verbose Payload</b><br>(Text-based, Larger Size)</center>"]
A5["<center><b>Network Transmission</b></center>"]
A6["<center><b>Device B / Server</b></center>"]
A7["<center><b>Deserialization</b><br>(JSON / XML Parsing)<br><i>Computationally Intensive</i></center>"]
A8["<center><b>Processed Data</b></center>"]
A1 --> A2 --> A3 --> A4 --> A5 --> A6 --> A7 --> A8
end
Problem["<center><b>IoT Communication Challenge</b><br>Limited Resources, Efficiency Needed</center>"] --> Traditional_Data_Exchange("Traditional Approach")
Problem --> Protocol_Buffer_Exchange("Protobuf Approach")
style Problem fill:#FEE2E2,stroke:#DC2626,stroke-width:2px,color:#991B1B
style Traditional_Data_Exchange fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E
style Protocol_Buffer_Exchange fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46
style A1 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
style A2 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
style A3 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
style A4 fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B
style A5 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
style A6 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
style A7 fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B
style A8 fill:#D1FAE5,stroke:#059669,stroke-width:1px,color:#065F46
style B1 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
style B2 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
style B3 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
style B4 fill:#D1FAE5,stroke:#059669,stroke-width:1px,color:#065F46
style B5 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
style B6 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
style B7 fill:#D1FAE5,stroke:#059669,stroke-width:1px,color:#065F46
style B8 fill:#D1FAE5,stroke:#059669,stroke-width:1px,color:#065F46
Protocol Buffers (Protobuf), developed by Google, offer a compelling alternative. They are a language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but much smaller, faster, and simpler. By defining your data structure once, you can use generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages.
This chapter will guide you through understanding Protobuf, defining messages, and using the nanopb library – a specialized Protobuf implementation designed for resource-constrained environments – to efficiently serialize and deserialize data on your ESP32 projects. Mastering Protobuf will enable you to build more efficient and robust IoT applications.
Theory
What are Protocol Buffers?
Protocol Buffers provide a way to define structured data (called “messages”) and compile these definitions into code that can encode (serialize) and decode (parse/deserialize) these messages into a compact binary format.
Key advantages include:
- Efficiency: Protobuf messages are typically much smaller than equivalent JSON or XML representations due to their binary nature and optimized encoding schemes. Parsing is also significantly faster.
- Strong Typing and Schemas: Data structures are explicitly defined in .proto files. This schema enforcement helps prevent errors and ensures data consistency.
- Language Neutrality: protoc (the Protobuf compiler) can generate code for various languages (Java, Python, C++, C#, Go, etc.). For C on microcontrollers, we typically use a specialized generator like nanopb.
- Backward and Forward Compatibility: Protobuf is designed to allow changes to message definitions (like adding new fields) without breaking existing code, provided certain rules are followed. This is crucial for evolving systems.
Think of it as defining a struct in C, but with a standardized way to turn that struct into a compact byte stream that another system (even one written in a different language) can understand and reconstruct into an equivalent structure.
Defining Messages (.proto files)
You define your Protobuf messages in text files with a .proto extension. The syntax is straightforward. We’ll focus on proto3 syntax, which is generally recommended for new projects.
Basic Syntax Elements:
- syntax = "proto3";: Specifies the syntax version.
- message MessageName { ... }: Defines a message type, similar to a class or struct.
- Field Types:
  - Scalar Types: int32, int64, uint32, uint64, sint32 (signed, uses ZigZag encoding for efficiency with negative numbers), sint64, bool, float, double, string, bytes.
  - enum EnumName { ... }: Defines an enumeration type.
  - Nested Messages: You can define messages within other messages.
- Field Rules:
  - repeated: For fields that can occur multiple times (like arrays or lists).
  - Singular fields (default in proto3): Fields that can occur zero or one time. Proto3 does not have required fields like proto2. A singular scalar field left at its default value (0, false, empty string) is not written to the wire at all, so the receiver cannot tell "unset" from "set to the default" unless the field is declared with the optional keyword.
- Field Numbers (Tags):
  - type field_name = N;
  - Each field in a message definition has a unique number (tag). These numbers are used to identify your fields in the binary message format and should not be changed once your message type is in use.
  - Tags from 1 to 15 take one byte to encode (including the field type). Tags from 16 to 2047 take two bytes. So, you should use tags 1 through 15 for your most frequently used fields.
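The one-byte versus two-byte rule follows from how the key is built: the field number is shifted left by three bits, the wire type occupies the low three bits, and the result is stored as a varint with seven payload bits per byte. The sketch below is plain C, independent of nanopb; the names make_key and key_size are illustrative helpers, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Wire types defined by the Protobuf encoding. */
enum wire_type {
    WT_VARINT = 0, /* int32, int64, uint32, uint64, sint32, sint64, bool, enum */
    WT_64BIT  = 1, /* fixed64, sfixed64, double */
    WT_LEN    = 2, /* string, bytes, embedded messages, packed repeated fields */
    WT_32BIT  = 5  /* fixed32, sfixed32, float */
};

/* Build the field key: field number in the upper bits, wire type in the low 3 bits. */
static uint32_t make_key(uint32_t field_number, enum wire_type wt)
{
    assert(field_number >= 1); /* field number 0 is not valid in Protobuf */
    return (field_number << 3) | (uint32_t)wt;
}

/* Number of bytes the key occupies once varint-encoded (7 payload bits per byte). */
static size_t key_size(uint32_t field_number, enum wire_type wt)
{
    uint32_t key = make_key(field_number, wt);
    size_t n = 1;
    while (key >= 0x80) {
        key >>= 7;
        n++;
    }
    return n;
}
```

With these helpers, key_size(15, WT_LEN) is 1 and key_size(16, WT_VARINT) is 2, which is why frequently sent fields belong in the 1 to 15 range.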
| Proto3 Type | C Equivalent (Typical with Nanopb) | Description & Encoding Notes |
|---|---|---|
| double | double | 64-bit floating-point. Wire type: 8-byte fixed. |
| float | float | 32-bit floating-point. Wire type: 4-byte fixed. |
| int32 | int32_t | 32-bit integer. Uses Varint encoding. Inefficient for negative numbers (a negative value always takes 10 bytes); prefer sint32 if negatives are common. |
| int64 | int64_t | 64-bit integer. Uses Varint encoding. Inefficient for negative numbers (a negative value always takes 10 bytes); prefer sint64 if negatives are common. |
| uint32 | uint32_t | 32-bit unsigned integer. Uses Varint encoding. |
| uint64 | uint64_t | 64-bit unsigned integer. Uses Varint encoding. |
| sint32 | int32_t | Signed 32-bit integer. Uses Varint encoding with ZigZag encoding for negative numbers (more efficient for signed values). |
| sint64 | int64_t | Signed 64-bit integer. Uses Varint encoding with ZigZag encoding. |
| fixed32 | uint32_t | 32-bit unsigned integer. Always four bytes. More efficient than uint32 if values are often larger than 2^28. Wire type: 4-byte fixed. |
| fixed64 | uint64_t | 64-bit unsigned integer. Always eight bytes. More efficient than uint64 if values are often larger than 2^56. Wire type: 8-byte fixed. |
| sfixed32 | int32_t | Signed 32-bit integer. Always four bytes. Wire type: 4-byte fixed. |
| sfixed64 | int64_t | Signed 64-bit integer. Always eight bytes. Wire type: 8-byte fixed. |
| bool | bool | Boolean value. Uses Varint encoding (0 or 1). |
| string | char[] (with max_size in .options) or callback | UTF-8 encoded or 7-bit ASCII string. Wire type: Length-delimited. |
| bytes | pb_byte_t[] (with max_size in .options) or callback | Arbitrary sequence of bytes. Wire type: Length-delimited. |
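The difference between int32 and sint32 in the table comes down to ZigZag encoding, which maps signed values to small unsigned ones (0, -1, 1, -2 become 0, 1, 2, 3) before the varint step. The following is a minimal sketch in plain C with hypothetical helper names; it does not use the nanopb API and only illustrates the size effect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ZigZag: interleave negative and positive values so small magnitudes stay small. */
static uint32_t zigzag_encode32(int32_t n)
{
    /* Shift the unsigned representation to avoid shifting a negative number. */
    return ((uint32_t)n << 1) ^ (n < 0 ? 0xFFFFFFFFu : 0u);
}

static int32_t zigzag_decode32(uint32_t z)
{
    /* The conversion back to int32_t relies on two's complement, as on ESP32. */
    return (int32_t)((z >> 1) ^ (0u - (z & 1u)));
}

/* Bytes needed to store v as a varint (7 payload bits per byte). */
static size_t varint_size(uint64_t v)
{
    size_t n = 1;
    while (v >= 0x80) {
        v >>= 7;
        n++;
    }
    return n;
}

/* int32: negative values are sign-extended to 64 bits before varint encoding. */
static size_t int32_wire_size(int32_t v)
{
    return varint_size((uint64_t)(int64_t)v);
}

/* sint32: ZigZag first, then varint. */
static size_t sint32_wire_size(int32_t v)
{
    return varint_size(zigzag_encode32(v));
}

/* Usage sketch: the values follow directly from the mapping above. */
void zigzag_example(void)
{
    assert(zigzag_encode32(-1) == 1u);
    assert(zigzag_decode32(zigzag_encode32(-12345)) == -12345);
    assert(int32_wire_size(-1) == 10);  /* plain int32 pays ten bytes */
    assert(sint32_wire_size(-1) == 1);  /* sint32 pays one */
}
```

For a temperature offset that is regularly negative, declaring the field as sint32 rather than int32 is the cheaper choice on the wire.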
Example .proto file (sensor.proto):
syntax = "proto3";
message SensorReading {
enum SensorType {
TEMPERATURE = 0;
HUMIDITY = 1;
PRESSURE = 2;
}
uint64 timestamp = 1; // Unix timestamp in milliseconds
float value = 2;
SensorType type = 3;
string location = 4; // Optional location identifier
repeated float calibration_coeffs = 5; // Example of a repeated field
bool battery_low = 6;
}
Compilation Process
Once you have a .proto file, you use the Protocol Buffer compiler, protoc, to generate data access classes (or structs and functions in C) in your chosen programming language.
For C on microcontrollers, protoc on its own is not enough: it has no plain C backend, and its C++ output relies on dynamic memory and a comparatively heavy runtime. This is where nanopb comes in. nanopb is a plain C implementation of Protocol Buffers specifically designed for restricted systems. It provides a Python script (nanopb_generator.py) that works as a plugin for protoc.
The process is typically:
- Install protoc.
- Get nanopb (which includes the generator script and the runtime library).
- Run protoc with the nanopb plugin:
protoc --plugin=protoc-gen-nanopb=path/to/nanopb/generator/nanopb_generator.py --nanopb_out=. sensor.proto
This command tells protoc to use the nanopb_generator.py script to generate C output files (.pb.c and .pb.h) in the current directory (.).
graph LR
A["<center><b>sensor.proto</b><br><i>Message Definitions</i></center>"] --> B{{"<center><b>protoc</b><br><i>Protocol Buffer Compiler</i></center>"}};
C["<center><b>nanopb_generator.py</b><br><i>Nanopb Python Plugin</i></center>"] --> B;
B --> D["<center><b>sensor.pb.h</b><br><i>Generated C Header</i><br>(Structs, Enums, Declarations)</center>"];
B --> E["<center><b>sensor.pb.c</b><br><i>Generated C Source</i><br>(Encode/Decode Functions, Field Descriptors)</center>"];
D --> F["<center><b>ESP32 Project</b><br><i>Application Code</i></center>"];
E --> F;
G["<center><b>nanopb Library</b><br><i>Core Runtime</i></center>"] --> F;
subgraph "Input Files"
direction LR
A
C
end
subgraph "Compilation Tool"
direction LR
B
end
subgraph "Generated Files"
direction LR
D
E
end
subgraph "Integration"
direction LR
F
G
end
style A fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6
style C fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6
style B fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E
style D fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46
style E fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46
style F fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
style G fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
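The generated .pb.h file will contain:
- C struct definitions corresponding to your messages (e.g., SensorReading).
- Field default value definitions.
- Macros for initializing messages (e.g., SensorReading_init_zero).
- Declarations of the message descriptors (e.g., SensorReading_fields) that you pass to the generic pb_encode() and pb_decode() functions.
The generated .pb.c file will contain:
- The definitions of those message descriptors. nanopb does not generate separate encode and decode functions for each message.
- In other words, the "fields array" (_fields) that describes the message structure to the nanopb runtime, which does the actual encoding and decoding.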
Nanopb: Protobuf for Microcontrollers
nanopb is ideal for ESP32 and other MCUs because:
| Nanopb Feature | Description | Benefit for ESP32 |
|---|---|---|
| Small Code Size | The core Nanopb library is very compact, and generated code is optimized for size. | Conserves precious flash memory on the ESP32, leaving more space for application logic. |
| Low RAM Usage | Avoids dynamic memory allocation by default. Data is typically stored in C structs on the stack or statically. | Crucial for ESP32’s limited RAM, preventing memory fragmentation and out-of-memory errors. |
| Streaming Support | Supports callback mechanisms for encoding/decoding large or variable-length fields (e.g., strings, bytes, repeated fields). | Allows processing of data larger than available RAM by handling it in chunks, essential for logs, firmware updates, or large sensor readings. |
| Customizable | .options files allow setting constraints like maximum string lengths, array counts, or field sizes. | Enables fine-grained control over memory allocation for message structures, helping to fit data within fixed memory budgets. |
| Plain C Implementation | Written in C, making it easy to integrate into C/C++ ESP-IDF projects. | No C++ overhead, straightforward compilation and linking within the ESP-IDF build system. |
- Small Code Size: The library itself is very compact.
- Low RAM Usage: It avoids dynamic memory allocation by default for message structures. Data is typically stored directly in C structs.
- Streaming Support: For fields that might be very large (like bytes or repeated fields), nanopb supports callback mechanisms. This allows you to process data in chunks without buffering the entire field in memory.
- Customizable: Various options can be set in a .options file (alongside your .proto file) to control aspects like maximum field sizes, string lengths, or array counts, helping to manage memory.
Wire Format (Briefly)
Protobuf messages are encoded into a binary wire format. Each field is encoded as a key-value pair.
- The key consists of the field number (tag) and a wire type (e.g., varint, 64-bit fixed, length-delimited, 32-bit fixed).
- The value is the actual data for that field.
Scalar numeric types often use Varints, an encoding scheme that uses one or more bytes to serialize integers – smaller numbers take fewer bytes. This contributes significantly to the compactness of Protobuf messages. Strings, bytes, and embedded messages are length-delimited.
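To make the varint scheme concrete, here is a minimal sketch in plain C: each output byte carries seven bits of the value, least significant group first, and the top bit signals that more bytes follow. The function names are illustrative and are not the nanopb API (nanopb provides its own varint routines internally).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode value as a varint into out (up to 10 bytes); returns bytes written. */
static size_t varint_encode(uint64_t value, uint8_t *out)
{
    assert(out != NULL);
    size_t n = 0;
    while (value >= 0x80) {
        out[n++] = (uint8_t)((value & 0x7F) | 0x80); /* low 7 bits + continuation flag */
        value >>= 7;
    }
    out[n++] = (uint8_t)value; /* final byte: continuation flag clear */
    return n;
}

/* Decode a varint; returns bytes consumed, or 0 if the input is truncated or overlong. */
static size_t varint_decode(const uint8_t *in, size_t len, uint64_t *value)
{
    assert(in != NULL && value != NULL);
    uint64_t result = 0;
    for (size_t i = 0; i < len && i < 10; i++) {
        result |= (uint64_t)(in[i] & 0x7F) << (7 * i);
        if ((in[i] & 0x80) == 0) {
            *value = result;
            return i + 1;
        }
    }
    return 0;
}
```

Encoding 300 this way yields the two bytes 0xAC 0x02, while any value below 128 fits in a single byte.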
This structured binary format is what makes Protobuf efficient to parse and allows for schema evolution (e.g., adding new fields with new tags won’t break older parsers, which will simply ignore unknown tags).
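The reason unknown tags can be ignored is that the wire type in each key tells a parser how many bytes to skip, even when it has never heard of the field. The sketch below is a hand-written scanner in plain C, not nanopb code; find_varint_field and read_varint are hypothetical helpers that look for one varint field and step over everything else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Read one varint; returns bytes consumed or 0 on malformed input. */
static size_t read_varint(const uint8_t *p, size_t len, uint64_t *out)
{
    uint64_t result = 0;
    for (size_t i = 0; i < len && i < 10; i++) {
        result |= (uint64_t)(p[i] & 0x7F) << (7 * i);
        if ((p[i] & 0x80) == 0) {
            *out = result;
            return i + 1;
        }
    }
    return 0;
}

/* Scan msg for varint field `wanted`; skip all other fields using their wire type. */
static bool find_varint_field(const uint8_t *msg, size_t len, uint32_t wanted, uint64_t *out)
{
    assert(msg != NULL && out != NULL);
    size_t pos = 0;
    bool found = false;
    while (pos < len) {
        uint64_t key, v;
        size_t n = read_varint(msg + pos, len - pos, &key);
        if (n == 0) return false;
        pos += n;
        uint32_t field = (uint32_t)(key >> 3);
        switch ((unsigned)(key & 7)) {
        case 0: /* varint */
            n = read_varint(msg + pos, len - pos, &v);
            if (n == 0) return false;
            pos += n;
            if (field == wanted) { *out = v; found = true; }
            break;
        case 1: /* 64-bit fixed */
            if (len - pos < 8) return false;
            pos += 8;
            break;
        case 2: /* length-delimited: a varint length, then that many bytes */
            n = read_varint(msg + pos, len - pos, &v);
            if (n == 0 || v > len - pos - n) return false;
            pos += n + (size_t)v;
            break;
        case 5: /* 32-bit fixed */
            if (len - pos < 4) return false;
            pos += 4;
            break;
        default:
            return false; /* wire types 3, 4 (groups) and 6, 7 are not handled here */
        }
    }
    return found;
}
```

Given the bytes 08 96 01 3A 02 68 69 30 01 (field 1 = 150, a newer string field 7 = "hi", field 6 = 1), a reader that only knows fields 1 and 6 still recovers both values, because the length prefix of field 7 tells it to skip two bytes.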
Practical Examples
We’ll use nanopb for our ESP32 examples.
Setting up nanopb with ESP-IDF
Integrating nanopb involves a few steps:
- Adding nanopb to your project:
  - The easiest way is to add nanopb as an ESP-IDF component. You can clone the nanopb repository (or download a release) into your project's components directory.
  - Alternatively, you can use the ESP-IDF component manager: idf.py add-dependency "nanopb/nanopb^0.4.7" (check for the latest compatible version).
- Ensuring protoc is available:
  - You need to install the Protocol Buffer compiler (protoc) on your development machine. Download it from the official Google Protocol Buffers GitHub releases page. Make sure it's in your system's PATH.
- Automating .proto compilation with CMake:
  - You'll want protoc to run automatically during the ESP-IDF build process whenever your .proto files change. This is done by adding custom CMake commands to your component's CMakeLists.txt or your project's main CMakeLists.txt.
graph TD
A["<center><b>Start: ESP-IDF Project</b></center>"] --> B{"<center>1. Add Nanopb to Project</center>"};
B --"Option A: As Component"--> B_A["<center>Clone/Download Nanopb<br>into 'components' directory</center>"];
B --"Option B: Component Manager"--> B_B["<center>Run:<br><tt>idf.py add-dependency \nanopb/nanopb^0.4.7\</tt></center>"];
B_A --> C{"<center>2. Ensure protoc is Available</center>"};
B_B --> C;
C --> C1["<center>Download Protoc Compiler<br>from Google's GitHub</center>"];
C1 --> C2["<center>Add protoc to System PATH</center>"];
C2 --> D{"<center>3. Automate .proto Compilation</center>"};
D --> D1["<center>Modify <tt>CMakeLists.txt</tt><br>(Project or Component level)</center>"];
D1 --> D2["<center>Add <tt>find_program(PROTOC_COMPILER protoc)</tt></center>"];
D1 --> D3["<center>Set path to <tt>nanopb_generator.py</tt></center>"];
D1 --> D4["<center>Add <tt>add_custom_command(...)</tt><br>to invoke protoc with nanopb plugin</center>"];
D4 --> D5["<center>Specify <tt>--nanopb_out</tt> and <tt>--proto_path</tt></center>"];
D5 --> E{"<center>4. Link Generated Files & Nanopb Lib</center>"};
E --> E1["<center>Add generated <tt>.pb.c</tt> files to <tt>target_sources</tt> or <tt>SRCS</tt></center>"];
E --> E2["<center>Add generated headers path to <tt>target_include_directories</tt> or <tt>INCLUDE_DIRS</tt></center>"];
E --> E3["<center>Ensure main component <tt>REQUIRES nanopb</tt><br>in <tt>idf_component_register</tt></center>"];
E3 --> F["<center><b>Integration Complete</b><br>Build Project (<tt>idf.py build</tt>)</center>"];
classDef startNode fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6;
classDef processNode fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF;
classDef decisionNode fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E;
classDef checkNode fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B;
classDef endNode fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46;
class A startNode;
class B,C,D,E decisionNode;
class B_A,B_B,C1,C2,D1,D2,D3,D4,D5,E1,E2,E3 processNode;
class F endNode;
CMake Integration Example (in your main CMakeLists.txt or a component’s CMakeLists.txt):
Assuming nanopb is a component and your .proto files are in a proto subdirectory of your main directory:
# Find protoc compiler
find_program(PROTOC_COMPILER protoc)
if(NOT PROTOC_COMPILER)
message(FATAL_ERROR "protoc compiler not found. Please install Protocol Buffers compiler (protoc) and ensure it's in your PATH.")
endif()
# Path to nanopb generator script (adjust if nanopb is located differently)
set(NANOPB_GENERATOR ${CMAKE_CURRENT_LIST_DIR}/../components/nanopb/generator/nanopb_generator.py)
# Or if using idf_component_manager, it might be:
# set(NANOPB_GENERATOR ${CMAKE_SOURCE_DIR}/managed_components/nanopb__nanopb/generator/nanopb_generator.py)
# Check the actual path after adding the dependency.
# If protoc does not accept the .py script as a plugin, try the protoc-gen-nanopb
# wrapper script that nanopb ships in the same generator directory.
# Directory for generated files
set(PROTO_GEN_DIR ${CMAKE_BINARY_DIR}/proto_gen)
file(MAKE_DIRECTORY ${PROTO_GEN_DIR})
# List your .proto files
set(PROTO_FILES ${CMAKE_CURRENT_LIST_DIR}/proto/sensor.proto) # Add more if needed
# Custom command to generate .pb.c and .pb.h files
foreach(PROTO_FILE ${PROTO_FILES})
get_filename_component(PROTO_FILENAME ${PROTO_FILE} NAME_WE) # Get filename without extension
set(OUTPUT_C ${PROTO_GEN_DIR}/${PROTO_FILENAME}.pb.c)
set(OUTPUT_H ${PROTO_GEN_DIR}/${PROTO_FILENAME}.pb.h)
add_custom_command(
OUTPUT ${OUTPUT_C} ${OUTPUT_H}
COMMAND ${PROTOC_COMPILER}
--plugin=protoc-gen-nanopb=${NANOPB_GENERATOR}
--nanopb_out=${PROTO_GEN_DIR}
--proto_path=${CMAKE_CURRENT_LIST_DIR}/proto # Directory containing your .proto files
${PROTO_FILE}
DEPENDS ${PROTO_FILE} ${NANOPB_GENERATOR} # Re-run if .proto or generator changes
COMMENT "Generating C sources from ${PROTO_FILE}"
VERBATIM
)
list(APPEND PROTO_GENERATED_SOURCES ${OUTPUT_C})
list(APPEND PROTO_GENERATED_HEADERS ${OUTPUT_H})
endforeach()
# Add the generated sources and their include directory to the build.
# Note: ESP-IDF does not create a target named 'app'. Inside a component's
# CMakeLists.txt the component library target is ${COMPONENT_LIB}, which
# exists only after idf_component_register() has been called.
target_sources(${COMPONENT_LIB} PRIVATE ${PROTO_GENERATED_SOURCES})
target_include_directories(${COMPONENT_LIB} PRIVATE ${PROTO_GEN_DIR})
# The nanopb runtime library is linked by declaring nanopb as a requirement
# of the component that uses it. When nanopb is set up correctly as an
# ESP-IDF component, no explicit target_link_libraries() call is needed.
# Alternatively, pass the generated files directly at registration time.
# In main/CMakeLists.txt:
# idf_component_register(SRCS "your_main_app.c" ${PROTO_GENERATED_SOURCES}
# INCLUDE_DIRS "." ${PROTO_GEN_DIR}
# REQUIRES nanopb) # This is key!
Important: The CMakeLists.txt for your main component (main/CMakeLists.txt) should then be updated to include the generated sources and require nanopb. For example:
# In main/CMakeLists.txt
# (Assuming the above CMake code for generation is in project's main CMakeLists.txt)
# Get the list of generated sources (if not already available in this scope)
# This might be better handled by setting a global property or passing variables.
# For simplicity, if the generation commands are in the project-level CMakeLists.txt,
# PROTO_GENERATED_SOURCES will be available.
idf_component_register(SRCS "your_main_app_file.c" # Add your main C file(s)
# Add ${PROTO_GENERATED_SOURCES} if generation is in project CMakeLists.txt
# and not directly handled by target_sources(app ...)
INCLUDE_DIRS "." "${CMAKE_BINARY_DIR}/proto_gen" # Add generated headers path
REQUIRES nanopb) # This links against nanopb component library
If you put the generation logic inside main/CMakeLists.txt itself, then PROTO_GENERATED_SOURCES can be directly used in SRCS.
Tip: Create a proto subdirectory in your main component (e.g., main/proto/) to keep your .proto files organized.
Example 1: Simple Sensor Data Serialization/Deserialization
Let’s use the sensor.proto file defined earlier.
1. main/proto/sensor.proto:
(Content as shown in the Theory section)
2. main/proto/sensor.options (Optional, for nanopb customization):
This file allows you to specify constraints for nanopb. For example, to limit the length of the location string and the count of calibration_coeffs:
SensorReading.location max_size:16
SensorReading.calibration_coeffs max_count:5
Place this file in the same directory as sensor.proto. nanopb_generator.py will automatically pick it up if the names match.
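To see why these two options matter, it helps to picture the struct the generator produces. The block below is an approximate, hand-written sketch of what sensor.pb.h is expected to contain with the options above; the exact output depends on the nanopb version, and the pb_size_t typedef here is only a stand-in for the real one in pb.h. It is meant for reading, not for compiling alongside the real generated header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for nanopb's pb_size_t from pb.h (assumption: a 16-bit count type). */
typedef uint_least16_t pb_size_t;

typedef enum _SensorReading_SensorType {
    SensorReading_SensorType_TEMPERATURE = 0,
    SensorReading_SensorType_HUMIDITY = 1,
    SensorReading_SensorType_PRESSURE = 2
} SensorReading_SensorType;

typedef struct _SensorReading {
    uint64_t timestamp;
    float value;
    SensorReading_SensorType type;
    char location[16];                   /* max_size:16 -> 15 characters + NUL */
    pb_size_t calibration_coeffs_count;  /* number of valid entries below */
    float calibration_coeffs[5];         /* max_count:5 */
    bool battery_low;
} SensorReading;

/* Both limits become compile-time array sizes, so no heap allocation is needed. */
static_assert(sizeof(((SensorReading *)0)->location) == 16, "location buffer size");
static_assert(sizeof(((SensorReading *)0)->calibration_coeffs) == 5 * sizeof(float),
              "calibration_coeffs capacity");
```

Without the .options entries, location and calibration_coeffs would have no known upper bound, and the generator would fall back to callback fields instead of fixed arrays.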
3. main/protobuf_example_main.c:
#include <stdio.h>
#include <string.h>
#include <freertos/FreeRTOS.h>
#include <freertos/task.h>
#include "esp_log.h"
#include "nvs_flash.h"
// Include generated protobuf header and nanopb core header
#include "sensor.pb.h" // This will be in your build/proto_gen/ directory
#include "pb_encode.h"
#include "pb_decode.h"
static const char *TAG = "PROTOBUF_EXAMPLE";
void protobuf_demo_task(void *pvParameters)
{
// --- Serialization Example ---
SensorReading sensor_data_tx = SensorReading_init_zero; // Zero-initialize every field
uint8_t tx_buffer[128]; // Buffer to hold serialized data
pb_ostream_t tx_stream;
// Populate the SensorReading message
sensor_data_tx.timestamp = 1678886400000ULL; // Example timestamp
sensor_data_tx.value = 25.7f;
sensor_data_tx.type = SensorReading_SensorType_TEMPERATURE;
// location is char[16] because of max_size:16 in sensor.options, so it holds
// at most 15 characters plus the terminating NUL. The struct was zeroed above,
// so copying at most sizeof - 1 bytes keeps the string terminated.
strncpy(sensor_data_tx.location, "LivingRoom", sizeof(sensor_data_tx.location) - 1);
// No has_location flag is set here: nanopb generates has_ members only for
// fields declared with the proto3 'optional' keyword (and for submessages).
// A plain proto3 string is encoded whenever it is non-empty.
sensor_data_tx.calibration_coeffs_count = 2;
sensor_data_tx.calibration_coeffs[0] = 1.01f;
sensor_data_tx.calibration_coeffs[1] = -0.5f;
sensor_data_tx.battery_low = false;
// Create an output stream for the buffer
tx_stream = pb_ostream_from_buffer(tx_buffer, sizeof(tx_buffer));
// Encode the message
bool status = pb_encode(&tx_stream, SensorReading_fields, &sensor_data_tx);
size_t message_length = tx_stream.bytes_written;
if (!status) {
ESP_LOGE(TAG, "Encoding failed: %s", PB_GET_ERROR(&tx_stream));
vTaskDelete(NULL);
return;
}
ESP_LOGI(TAG, "Successfully encoded message (%u bytes):", (unsigned)message_length);
ESP_LOG_BUFFER_HEX(TAG, tx_buffer, message_length);
// --- Deserialization Example ---
SensorReading sensor_data_rx = SensorReading_init_zero; // Initialize to defaults
pb_istream_t rx_stream;
// Create an input stream from the buffer containing the serialized data
rx_stream = pb_istream_from_buffer(tx_buffer, message_length);
// Decode the message
status = pb_decode(&rx_stream, SensorReading_fields, &sensor_data_rx);
if (!status) {
ESP_LOGE(TAG, "Decoding failed: %s", PB_GET_ERROR(&rx_stream));
vTaskDelete(NULL);
return;
}
ESP_LOGI(TAG, "Successfully decoded message:");
ESP_LOGI(TAG, "Timestamp: %llu", sensor_data_rx.timestamp);
ESP_LOGI(TAG, "Value: %.2f", sensor_data_rx.value);
ESP_LOGI(TAG, "Type: %d (%s)", sensor_data_rx.type,
sensor_data_rx.type == SensorReading_SensorType_TEMPERATURE ? "TEMPERATURE" :
sensor_data_rx.type == SensorReading_SensorType_HUMIDITY ? "HUMIDITY" : "PRESSURE");
// With max_size set in sensor.options, location is a fixed char array that
// nanopb null-terminates on decode. Without a size limit the generator emits
// a callback field (pb_callback_t) instead, which cannot be printed this way.
ESP_LOGI(TAG, "Location: %s", sensor_data_rx.location);
ESP_LOGI(TAG, "Calibration Coefficients (%d):", (int)sensor_data_rx.calibration_coeffs_count);
for (int i = 0; i < sensor_data_rx.calibration_coeffs_count; i++) {
ESP_LOGI(TAG, " Coeff %d: %.2f", i, sensor_data_rx.calibration_coeffs[i]);
}
ESP_LOGI(TAG, "Battery Low: %s", sensor_data_rx.battery_low ? "true" : "false");
vTaskDelete(NULL);
}
void app_main(void)
{
// Initialize NVS - boilerplate
esp_err_t ret = nvs_flash_init();
if (ret == ESP_ERR_NVS_NO_FREE_PAGES || ret == ESP_ERR_NVS_NEW_VERSION_FOUND) {
ESP_ERROR_CHECK(nvs_flash_erase());
ret = nvs_flash_init();
}
ESP_ERROR_CHECK(ret);
xTaskCreate(protobuf_demo_task, "protobuf_demo_task", 4096, NULL, 5, NULL);
}
4. Build Instructions:
- Ensure protoc is installed and in your PATH.
- Ensure nanopb is correctly added as a component to your ESP-IDF project (e.g., in components/nanopb or via idf.py add-dependency).
- Verify your CMake setup for protoc and nanopb_generator.py as described above.
- Run:
idf.py set-target esp32 # or your target
idf.py build
5. Run/Flash/Observe:
- Flash the firmware: idf.py -p /dev/ttyUSB0 flash monitor.
- Observe the serial monitor output. You should see:
  - Logs indicating successful encoding and the hex dump of the serialized data.
  - Logs indicating successful decoding and the reconstructed data matching the original.
  - The size of the encoded message, which should be significantly smaller than an equivalent JSON string.
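To put a number on "significantly smaller", the sketch below estimates the encoded size of the example message from the wire-format rules alone and compares it with one possible JSON rendering. It is plain C with hypothetical helper names and does not call nanopb; it assumes proto3 behaviour where zero-valued singular fields are omitted and the repeated float field is packed. The byte count logged by Example 1 on the device remains the authoritative figure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes needed to store v as a varint. */
static size_t varint_size(uint64_t v)
{
    size_t n = 1;
    while (v >= 0x80) { v >>= 7; n++; }
    return n;
}

/* Size of the key for a given field number and wire type. */
static size_t key_size(uint32_t field, unsigned wire_type)
{
    return varint_size(((uint64_t)field << 3) | wire_type);
}

/* Varint field: omitted entirely when the value is zero (proto3 default). */
static size_t varint_field_size(uint32_t field, uint64_t v)
{
    return v == 0 ? 0 : key_size(field, 0) + varint_size(v);
}

/* float field: 4 fixed bytes, omitted when all bits are zero. */
static size_t float_field_size(uint32_t field, float v)
{
    uint32_t bits;
    memcpy(&bits, &v, sizeof bits);
    return bits == 0 ? 0 : key_size(field, 5) + 4;
}

/* Length-delimited field: key, length varint, payload; omitted when empty. */
static size_t len_field_size(uint32_t field, size_t payload_len)
{
    return payload_len == 0 ? 0
         : key_size(field, 2) + varint_size(payload_len) + payload_len;
}

/* One possible JSON rendering of the same reading (an assumption for comparison). */
static const char example_json[] =
    "{\"timestamp\":1678886400000,\"value\":25.7,\"type\":\"TEMPERATURE\","
    "\"location\":\"LivingRoom\",\"calibration_coeffs\":[1.01,-0.5],"
    "\"battery_low\":false}";

/* Estimated wire size of the SensorReading populated in Example 1. */
size_t example_protobuf_size(void)
{
    assert(sizeof(float) == 4);
    return varint_field_size(1, 1678886400000ULL)   /* timestamp */
         + float_field_size(2, 25.7f)               /* value */
         + varint_field_size(3, 0)                  /* type = TEMPERATURE (0): omitted */
         + len_field_size(4, strlen("LivingRoom"))  /* location */
         + len_field_size(5, 2 * sizeof(float))     /* calibration_coeffs, packed */
         + varint_field_size(6, 0);                 /* battery_low = false: omitted */
}
```

Under these assumptions example_protobuf_size() returns 34 bytes, while the JSON string above is roughly four times as long.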
Example 2: Using Protobuf with a Communication Protocol (Conceptual)
Protobuf is a serialization format, not a transport protocol. It’s commonly used with protocols like MQTT, CoAP, HTTP, or even raw TCP/UDP sockets.
Scenario: Sending SensorReading over MQTT.
- Serialize: Your ESP32 collects sensor data, populates the SensorReading C struct, and serializes it into tx_buffer (resulting in message_length bytes) as shown in Example 1.
- Transmit: This tx_buffer (of message_length bytes) becomes the payload of your MQTT message.
// Conceptual MQTT publish (assuming mqtt_client is initialized)
// esp_mqtt_client_publish(mqtt_client, "esp32/sensor/data", (const char *)tx_buffer, message_length, 1, 0);
- Receive (on another device/server):
  - The MQTT subscriber receives the byte array.
  - It uses Protobuf (with the same .proto definition compiled for its language, e.g., Python, Java) to deserialize the bytes back into a SensorReading message object.
Similarly for CoAP:
The tx_buffer could be the payload for a CoAP PUT or POST request. The CoAP server would then deserialize it.
Benefit: The actual data sent over the network (tx_buffer) is compact, reducing bandwidth usage and transmission time.
Variant Notes
- All ESP32 Variants (ESP32, ESP32-S2, ESP32-S3, ESP32-C3, ESP32-C6, ESP32-H2):
  - Protocol Buffers, especially with nanopb, are well-suited for all ESP32 variants. nanopb's low resource footprint makes it viable even on single-core, memory-constrained variants like the ESP32-C3.
- Performance:
  - Encoding and decoding Protobuf messages involve some CPU overhead. A single encode or decode call runs on one core, so a higher clock speed shortens it directly; a second core (ESP32, ESP32-S3) helps mainly by leaving the other core free for networking and application tasks.
  - However, this CPU cost is often offset by the significant reduction in data size and parsing complexity compared to text-based formats, especially if network transmission is a bottleneck.
- Memory (Flash/RAM):
  - Flash: The nanopb runtime library is small. The generated .pb.c files will add to your code size, proportional to the complexity and number of your message definitions.
  - RAM:
    - nanopb primarily uses stack memory for its operations and for storing the C structs representing your messages (if sizes are constrained via .options files).
    - Buffers for serialized/deserialized data need to be allocated (e.g., tx_buffer in the example). Their size depends on the maximum expected message size.
    - For very large or unbounded fields (strings, bytes, repeated fields) not constrained by .options, nanopb can use callbacks. This allows processing data in streams without buffering everything in RAM, but adds complexity to your application code.
- Network Efficiency: The primary benefit across all variants is the reduction in payload size, which is particularly advantageous for battery-powered devices or those on metered/low-bandwidth networks (like LoRaWAN or NB-IoT, often bridged to IP protocols carrying Protobuf).
Common Mistakes & Troubleshooting Tips
| Mistake / Issue | Symptom(s) | Troubleshooting / Solution |
|---|---|---|
| .proto Syntax Errors | protoc compiler fails with error messages pointing to lines in the .proto file. Build may halt. |
Carefully validate .proto syntax (message definitions, field types, tags, enums) against official Protobuf documentation (proto3).Check for typos, missing semicolons, incorrect keywords. |
| protoc Not Found / Path Issues | Build error: protoc: command not found or CMake error protoc compiler not found. |
Ensure protoc is installed on your development machine.Verify that the directory containing the protoc executable is in your system’s PATH environment variable.In CMake, ensure find_program(PROTOC_COMPILER protoc) is successful.
|
| CMake Integration Failure | Generated .pb.c/.pb.h files not created, or not found by the compiler. Linker errors related to Nanopb functions or generated symbols. |
Double-check paths in CMakeLists.txt for nanopb_generator.py, .proto files, and output directories (PROTO_GEN_DIR).Ensure nanopb_generator.py is executable.Verify target_sources and target_include_directories correctly list generated files/paths.Confirm your main component REQUIRES nanopb in idf_component_register.
|
| Outdated Generated Files | Runtime behavior doesn’t match recent .proto file changes. Unexpected data or errors during serialization/deserialization. | Ensure your build system correctly regenerates .pb.c and .pb.h files when .proto files (or .options files) change. The DEPENDS clause in add_custom_command is crucial. Perform a clean build: idf.py fullclean && idf.py build. |
| Buffer Overflows / Truncation (Serialization) | pb_encode returns false. PB_GET_ERROR(&stream) typically reports “stream full” when the output buffer is too small. Data sent is incomplete. | Ensure the buffer provided to pb_ostream_from_buffer is large enough for the fully serialized message. Estimate the maximum size or use Nanopb’s pb_get_encoded_size() if feasible (requires an extra pass). For strings/bytes/repeated fields in structs, if not using .options with max_size/max_count, ensure allocated memory for these pointers is sufficient or use callbacks. |
| Buffer Issues (Deserialization) | pb_decode returns false. PB_GET_ERROR(&stream) might report an error such as “end-of-stream”, “invalid wire_type”, or “string overflow”. Corrupted data after decoding. | When using pb_istream_from_buffer, ensure the provided message_length matches the actual size of the received Protobuf data. If using .options to define max_size for strings/bytes or max_count for repeated fields, ensure the incoming data doesn’t exceed these limits. If it does, decoding will fail. Consider callbacks for unbounded fields. |
| Incorrect _count for Repeated Fields | Serialization: Not all elements of a repeated field are encoded. Deserialization: _count is 0 or incorrect, leading to missed data or reading garbage. | Encoding: Before calling pb_encode, set the fieldname_count member of your struct to the actual number of elements you’ve populated in the array. Decoding: After pb_decode, check fieldname_count to know how many elements were successfully decoded into the array. Loop up to this count. |
| Handling Optional Scalar Fields (proto3) & has_ fields | Confusion about whether a scalar field (int, bool, enum) was actually present in the message or just has its default value (e.g., 0 for int, false for bool). Nanopb generates a has_fieldname member only for fields with explicit presence: proto2 optional fields, proto3 fields declared with the optional keyword, and submessage fields. Plain proto3 scalars have no has_ member. | Encoding: A plain proto3 scalar is written only when it holds a non-default value, so 0 or false is simply omitted. For a field that has a has_fieldname member, set your_message.has_fieldname = true; before encoding, otherwise the field is skipped even if you assigned a value. Decoding: After decoding, check your_message.has_fieldname where it exists. If true, the field was present in the stream. If false, it was not, and the field in your struct will hold its default value (e.g., 0, false). For strings and bytes without a has_ member, an empty string/byte array is how “not present” appears. |
| Mismatched .proto / .options Files | Generator errors or warnings about unknown fields in .options, or runtime issues if .options constraints (e.g., max_size) don’t match the .proto definition used by the other end. | Ensure the .options file (e.g., MyMessage.options) corresponds exactly to the MyMessage.proto file it’s meant to customize. All communicating parties must use compatible .proto definitions. If one side uses .options to limit string size, the other side must know that limit: a longer string will cause decoding to fail on the constrained side rather than be silently truncated. |
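The deserialization advice above depends on passing the decoder exactly the number of bytes received. As a rough illustration of why, the sketch below reads a varint from a bounded buffer and fails when the input ends mid-value. It is a stand-in written for this chapter, not nanopb's decoder: the in_cursor_t type, the read_varint function, and the error strings are hypothetical labels chosen for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal bounded input cursor (hypothetical; not nanopb's pb_istream_t). */
typedef struct {
    const uint8_t *data;   /* next unread byte */
    size_t remaining;      /* bytes left in the received message */
    const char *error;     /* NULL until a read fails */
} in_cursor_t;

/* Read one varint. Returns false and sets in->error if the buffer ends
 * before the final byte, or if the value needs more than 10 bytes. */
static bool read_varint(in_cursor_t *in, uint64_t *out)
{
    assert(in != NULL && out != NULL);
    uint64_t result = 0;
    for (unsigned shift = 0; shift < 64; shift += 7) {
        if (in->remaining == 0) {
            in->error = "end-of-stream";      /* length shorter than the data needs */
            return false;
        }
        uint8_t byte = *in->data++;
        in->remaining--;
        result |= (uint64_t)(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {             /* MSB clear marks the last byte */
            *out = result;
            return true;
        }
    }
    in->error = "varint overflow";            /* continuation bit never cleared */
    return false;
}
```

Feeding the two bytes 0xAC 0x02 with a length of 2 yields 300, while the same bytes with a length of 1 fail, which is the kind of failure a wrong message_length produces in a real decoder.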
Exercises
- Device Configuration Message:
  - Define a .proto message named DeviceConfig with fields for Wi-Fi SSID (string), password (string), an update interval (uint32, in seconds), and a list of enabled sensor types (repeated enum, using SensorType from the chapter example).
  - Write an ESP32 application that:
    - Initializes a DeviceConfig struct with sample data.
    - Serializes it to a buffer and logs the hex output.
    - Deserializes it back and prints the configuration.
  - Use an .options file to set max_size for SSID and password, and max_count for enabled sensor types.
- Protobuf with Callbacks for Large Data:
  - Modify the SensorReading message to include a bytes diagnostic_log = 7; field.
  - Research and implement nanopb callbacks for this diagnostic_log field.
  - Serialization: Instead of having a large byte array in your C struct, use an encoding callback that provides data for diagnostic_log in chunks (e.g., read from a dummy source or generate it).
  - Deserialization: Use a decoding callback that receives chunks of diagnostic_log data and processes them (e.g., prints them to the console) instead of buffering the entire log.
  - This exercise demonstrates handling data larger than available RAM.
- Error Handling and Validation:
  - Take the first practical example (SensorReading).
  - Introduce error conditions:
    - Try to encode into a buffer that is too small. Check the return status of pb_encode and log PB_GET_ERROR(&tx_stream).
    - Simulate corrupted Protobuf data (e.g., by slightly altering the tx_buffer after encoding) and try to decode it. Check the return status of pb_decode and log PB_GET_ERROR(&rx_stream).
  - Implement basic validation after decoding (e.g., check if sensor_data_rx.value is within an expected range for the given sensor_data_rx.type).
Summary
- Protocol Buffers (Protobuf) offer an efficient, strongly-typed, and language-neutral way to serialize structured data, ideal for resource-constrained IoT devices like the ESP32.
- Data structures (messages) are defined in .proto files using a clear syntax.
- The protoc compiler, along with a C-specific plugin like nanopb, generates C structs and encode/decode functions from .proto definitions.
- nanopb is specifically designed for microcontrollers, offering low code size, minimal RAM usage, and no dynamic memory allocation by default.
- Serialization (encoding) converts C structs into a compact binary format. Deserialization (decoding) reconstructs C structs from this binary data.
- The binary wire format is compact due to techniques like Varint encoding and tag-based field identification, which also supports schema evolution.
- Protobuf serialized data can be used as a payload in various communication protocols (MQTT, CoAP, HTTP, etc.), significantly reducing network bandwidth.
- Proper CMake integration is crucial for automating the generation of Protobuf C files within an ESP-IDF project.
Further Reading
- Google Protocol Buffers Developer Guide (Proto3): https://developers.google.com/protocol-buffers/docs/proto3
- Nanopb Documentation: https://jpa.kapsi.fi/nanopb/docs/ (Especially the “Getting started” and “Reference” sections)
- Nanopb GitHub Repository (source, examples): https://github.com/nanopb/nanopb
- ESP-IDF Component Registry (for nanopb): Search for nanopb on https://components.espressif.com/ for managed component integration.

