Chapter 119: Protocol Buffers on ESP32

Chapter Objectives

By the end of this chapter, you will be able to:

Understand what Protocol Buffers (Protobuf) are and their advantages for IoT applications.
Define data structures (messages) using the Protobuf .proto syntax.
Compile .proto files into C code suitable for microcontrollers using protoc and the nanopb generator.
Integrate the nanopb library into an ESP-IDF project.
Serialize C structures into the compact Protobuf binary format on an ESP32.
Deserialize Protobuf binary data back into C structures on an ESP32.
Understand the trade-offs (code size, performance, memory usage) of using Protobuf on ESP32.
Recognize how Protobuf can be used with various communication protocols (MQTT, CoAP, HTTP) for efficient data exchange.

Introduction

In the world of IoT and embedded systems, efficient communication is paramount. Devices often operate with limited processing power, memory, and network bandwidth. While text-based data formats like JSON and XML are human-readable and widely used, they can be verbose and computationally expensive to parse on microcontrollers like the ESP32. This overhead can impact performance, power consumption, and data transmission costs.

graph TB
    subgraph Protocol Buffer Exchange
        direction TB
        B1["<center><b>Device A</b><br>(e.g., ESP32)</center>"]
        B2["<center><b>Data</b><br>(Defined in .proto)</center>"]
        B3["<center><b>Serialization</b><br>(Protobuf Encoding)</center>"]
        B4["<center><b>Compact Payload</b><br>(Binary, Smaller Size)</center>"]
        B5["<center><b>Network Transmission</b></center>"]
        B6["<center><b>Device B / Server</b></center>"]
        B7["<center><b>Deserialization</b><br>(Protobuf Decoding)<br><i>Faster, Less Intensive</i></center>"]
        B8["<center><b>Processed Data</b></center>"]

        B1 --> B2 --> B3 --> B4 --> B5 --> B6 --> B7 --> B8
    end
    subgraph Traditional Data Exchange
        direction TB
        A1["<center><b>Device A</b><br>(e.g., ESP32)</center>"]
        A2["<center><b>Data</b><br>(Sensor Readings, etc.)</center>"]
        A3["<center><b>Serialization</b><br>(JSON / XML)</center>"]
        A4["<center><b>Verbose Payload</b><br>(Text-based, Larger Size)</center>"]
        A5["<center><b>Network Transmission</b></center>"]
        A6["<center><b>Device B / Server</b></center>"]
        A7["<center><b>Deserialization</b><br>(JSON / XML Parsing)<br><i>Computationally Intensive</i></center>"]
        A8["<center><b>Processed Data</b></center>"]

        A1 --> A2 --> A3 --> A4 --> A5 --> A6 --> A7 --> A8
    end



    Problem["<center><b>IoT Communication Challenge</b><br>Limited Resources, Efficiency Needed</center>"] --> Traditional_Data_Exchange("Traditional Approach")
    Problem --> Protocol_Buffer_Exchange("Protobuf Approach")

    style Problem fill:#FEE2E2,stroke:#DC2626,stroke-width:2px,color:#991B1B
    style Traditional_Data_Exchange fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E
    style Protocol_Buffer_Exchange fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46

    style A1 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
    style A2 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
    style A3 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
    style A4 fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B
    style A5 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
    style A6 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
    style A7 fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B
    style A8 fill:#D1FAE5,stroke:#059669,stroke-width:1px,color:#065F46

    style B1 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
    style B2 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
    style B3 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
    style B4 fill:#D1FAE5,stroke:#059669,stroke-width:1px,color:#065F46
    style B5 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
    style B6 fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
    style B7 fill:#D1FAE5,stroke:#059669,stroke-width:1px,color:#065F46
    style B8 fill:#D1FAE5,stroke:#059669,stroke-width:1px,color:#065F46

Protocol Buffers (Protobuf), developed by Google, offer a compelling alternative. They are a language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but much smaller, faster, and simpler. By defining your data structure once, you can use generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages.

This chapter will guide you through understanding Protobuf, defining messages, and using the nanopb library – a specialized Protobuf implementation designed for resource-constrained environments – to efficiently serialize and deserialize data on your ESP32 projects. Mastering Protobuf will enable you to build more efficient and robust IoT applications.

Theory

What are Protocol Buffers?

Protocol Buffers provide a way to define structured data (called “messages”) and compile these definitions into code that can encode (serialize) and decode (parse/deserialize) these messages into a compact binary format.

Key advantages include:

Efficiency: Protobuf messages are typically much smaller than equivalent JSON or XML representations due to their binary nature and optimized encoding schemes. Parsing is also significantly faster.
Strong Typing and Schemas: Data structures are explicitly defined in .proto files. This schema enforcement helps prevent errors and ensures data consistency.
Language Neutrality: protoc (the Protobuf compiler) can generate code for various languages (Java, Python, C++, C#, Go, etc.). For C on microcontrollers, we typically use a specialized generator like nanopb.
Backward and Forward Compatibility: Protobuf is designed to allow changes to message definitions (like adding new fields) without breaking existing code, provided certain rules are followed. This is crucial for evolving systems.

Think of it as defining a struct in C, but with a standardized way to turn that struct into a compact byte stream that another system (even one written in a different language) can understand and reconstruct into an equivalent structure.

Defining Messages (`.proto` files)

You define your Protobuf messages in text files with a .proto extension. The syntax is straightforward. We’ll focus on proto3 syntax, which is generally recommended for new projects.

Basic Syntax Elements:

syntax = "proto3";: Specifies the syntax version.
message MessageName { ... }: Defines a message type, similar to a class or struct.
Field Types:
- Scalar Types: int32, int64, uint32, uint64, sint32 (signed, uses ZigZag encoding for efficiency with negative numbers), sint64, bool, float, double, string, bytes.
- enum EnumName { ... }: Defines an enumeration type.
- Nested Messages: You can define messages within other messages.
Field Rules:
- repeated: For fields that can occur multiple times (like arrays or lists).
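- Singular fields (default in proto3): Fields that can occur zero or one time. Proto3 has no required fields (unlike proto2). A singular field left at its default value (0, false, empty string) is not written to the wire, so a receiver cannot tell "unset" from "set to the default" unless the field is declared with the optional keyword.
Field Numbers (Tags): type field_name = N;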
- Each field in a message definition has a unique number (tag). These numbers are used to identify your fields in the binary message format and should not be changed once your message type is in use.
- Tags from 1 to 15 take one byte to encode (including the field type). Tags from 16 to 2047 take two bytes. So, you should use tags 1 through 15 for your most frequently used fields.
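The one-byte versus two-byte rule follows from how a field key is built: the field number is shifted left by three bits, combined with the wire type, and the result is stored as a varint. The sketch below is a minimal illustration of that arithmetic; the function names are made up for this example and are not part of nanopb.

```c
#include <stddef.h>
#include <stdint.h>

/* Wire-type codes defined by the Protobuf encoding. */
enum { WT_VARINT = 0, WT_FIXED64 = 1, WT_LEN = 2, WT_FIXED32 = 5 };

/* A field key is the field number shifted left by 3, OR-ed with the wire type. */
static uint32_t field_key(uint32_t field_number, uint32_t wire_type)
{
    return (field_number << 3) | wire_type;
}

/* Number of bytes a value occupies as a varint (7 payload bits per byte). */
static size_t varint_size(uint64_t value)
{
    size_t n = 1;
    while (value >= 0x80) {
        value >>= 7;
        n++;
    }
    return n;
}

/* Bytes needed to store the key of a given field number. */
static size_t key_size(uint32_t field_number)
{
    return varint_size(field_key(field_number, WT_VARINT));
}
```

Field number 15 yields the key 120, which still fits in seven bits; field number 16 yields 128, which needs a second byte.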

Proto3 Type	C Equivalent (Typical with Nanopb)	Description & Encoding Notes
`double`	`double`	64-bit floating-point. Wire type: 8-byte fixed.
`float`	`float`	32-bit floating-point. Wire type: 4-byte fixed.
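`int32`	`int32_t`	32-bit integer. Uses Varint encoding. Inefficient for negative numbers: a negative value always takes 10 bytes, so prefer `sint32` if negatives are common.
`int64`	`int64_t`	64-bit integer. Uses Varint encoding. Inefficient for negative numbers: a negative value always takes 10 bytes, so prefer `sint64` if negatives are common.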
`uint32`	`uint32_t`	32-bit unsigned integer. Uses Varint encoding.
`uint64`	`uint64_t`	64-bit unsigned integer. Uses Varint encoding.
`sint32`	`int32_t`	Signed 32-bit integer. Uses Varint encoding with ZigZag encoding for negative numbers (more efficient for signed values).
`sint64`	`int64_t`	Signed 64-bit integer. Uses Varint encoding with ZigZag encoding.
`fixed32`	`uint32_t`	32-bit unsigned integer. Always four bytes. More efficient than `uint32` if values are often larger than 2²⁸. Wire type: 4-byte fixed.
`fixed64`	`uint64_t`	64-bit unsigned integer. Always eight bytes. More efficient than `uint64` if values are often larger than 2⁵⁶. Wire type: 8-byte fixed.
`sfixed32`	`int32_t`	Signed 32-bit integer. Always four bytes. Wire type: 4-byte fixed.
`sfixed64`	`int64_t`	Signed 64-bit integer. Always eight bytes. Wire type: 8-byte fixed.
`bool`	`bool` / `pb_bool_t`	Boolean value. Uses Varint encoding (0 or 1).
`string`	`char[]` (with `max_size` in .options) or callback	UTF-8 encoded or 7-bit ASCII string. Wire type: Length-delimited.
`bytes`	`pb_byte_t[]` (with `max_size` in .options) or callback	Arbitrary sequence of bytes. Wire type: Length-delimited.
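To make the difference between int32 and sint32 in the table concrete, here is a small sketch of ZigZag mapping and the resulting varint sizes. The helper names are hypothetical; the behavior shown (sign extension of int32 to 64 bits before varint encoding, ZigZag for sint32) follows the Protobuf encoding rules.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes a value occupies as a varint. */
static size_t varint_len(uint64_t value)
{
    size_t n = 1;
    while (value >= 0x80) {
        value >>= 7;
        n++;
    }
    return n;
}

/* ZigZag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ... so small magnitudes stay small. */
static uint32_t zigzag32(int32_t n)
{
    uint32_t shifted = (uint32_t)n << 1;
    return (n < 0) ? ~shifted : shifted;
}

/* int32: the value is sign-extended to 64 bits, then varint-encoded. */
static size_t int32_wire_size(int32_t n)
{
    return varint_len((uint64_t)(int64_t)n);
}

/* sint32: the value is ZigZag-mapped first, then varint-encoded. */
static size_t sint32_wire_size(int32_t n)
{
    return varint_len(zigzag32(n));
}
```

With these helpers, the value -1 costs 10 bytes as int32 but a single byte as sint32, while small positive values cost one byte either way.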

Example .proto file (sensor.proto):

Protocol Buffers

syntax = "proto3";

message SensorReading {
  enum SensorType {
    TEMPERATURE = 0;
    HUMIDITY = 1;
    PRESSURE = 2;
  }

  uint64 timestamp = 1;       // Unix timestamp in milliseconds
  float value = 2;
  SensorType type = 3;
  string location = 4;        // Optional location identifier
  repeated float calibration_coeffs = 5; // Example of a repeated field
  bool battery_low = 6;
}

Compilation Process

Once you have a .proto file, you use the Protocol Buffer compiler, protoc, to generate data access classes (or structs and functions in C) in your chosen programming language.

For C on microcontrollers, the standard Google protoc output is often too heavy (relying on dynamic memory, complex C++ classes). This is where nanopb comes in. nanopb is a plain C implementation of Protocol Buffers specifically designed for restricted systems. It provides a Python script (nanopb_generator.py) that works as a plugin for protoc.

The process is typically:

Install protoc.
Get nanopb (which includes the generator script and the runtime library).
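Run protoc with the nanopb plugin:
protoc --plugin=protoc-gen-nanopb=path/to/nanopb/generator/protoc-gen-nanopb --nanopb_out=. sensor.proto
This command tells protoc to run the nanopb generator through its protoc-gen-nanopb wrapper script (protoc-gen-nanopb.bat on Windows), which calls nanopb_generator.py and writes the C output files (.pb.c and .pb.h) to the current directory (.). Pointing --plugin directly at nanopb_generator.py generally does not work, because the script only enters plugin mode when it is started under a name containing protoc-gen-.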

graph LR
    A["<center><b>sensor.proto</b><br><i>Message Definitions</i></center>"] --> B{{"<center><b>protoc</b><br><i>Protocol Buffer Compiler</i></center>"}};
    C["<center><b>nanopb_generator.py</b><br><i>Nanopb Python Plugin</i></center>"] --> B;
    B --> D["<center><b>sensor.pb.h</b><br><i>Generated C Header</i><br>(Structs, Enums, Declarations)</center>"];
    B --> E["<center><b>sensor.pb.c</b><br><i>Generated C Source</i><br>(Field Descriptor Definitions)</center>"];
    D --> F["<center><b>ESP32 Project</b><br><i>Application Code</i></center>"];
    E --> F;
    G["<center><b>nanopb Library</b><br><i>Core Runtime</i></center>"] --> F;

    subgraph "Input Files"
        direction LR
        A
        C
    end
    subgraph "Compilation Tool"
        direction LR
        B
    end
    subgraph "Generated Files"
        direction LR
        D
        E
    end
    subgraph "Integration"
        direction LR
        F
        G
    end

    style A fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6
    style C fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6
    style B fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E
    style D fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46
    style E fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46
    style F fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
    style G fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF

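The generated .pb.h file will contain:

C struct definitions corresponding to your messages (e.g., SensorReading), plus C enums for any enum types.
Macros for initializing messages (e.g., SensorReading_init_zero and SensorReading_init_default, which also carries any field default values).
Field tag constants and the declaration of each message's field descriptor (e.g., SensorReading_fields).

The generated .pb.c file will contain:

The definitions of those field descriptors, which describe the message structure to the nanopb runtime.

The encode and decode routines themselves (pb_encode, pb_decode) are not generated per message. They live in the nanopb runtime library (pb_encode.c, pb_decode.c, pb_common.c) and are driven by the field descriptors.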
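For orientation, the following is an approximation of the C types one might expect in sensor.pb.h for the SensorReading message, assuming size limits of 16 for location and 5 for calibration_coeffs are set through an .options file. It is a hand-written sketch, not generator output: the real header also contains tag constants and descriptor declarations, its exact layout depends on the nanopb version, and pb_size_t and sensor_reading_zero here are stand-ins so the snippet compiles without nanopb.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for nanopb's pb_size_t (normally declared in pb.h); assumed 16-bit here. */
typedef uint16_t pb_size_t;

/* Nested enums are prefixed with the enclosing message name. */
typedef enum {
    SensorReading_SensorType_TEMPERATURE = 0,
    SensorReading_SensorType_HUMIDITY = 1,
    SensorReading_SensorType_PRESSURE = 2
} SensorReading_SensorType;

typedef struct {
    uint64_t timestamp;
    float value;
    SensorReading_SensorType type;
    char location[16];                  /* assumes max_size:16, includes the NUL */
    pb_size_t calibration_coeffs_count; /* number of valid entries in the array */
    float calibration_coeffs[5];        /* assumes max_count:5 */
    bool battery_low;
} SensorReading;

/* Hypothetical helper standing in for the generated SensorReading_init_zero macro. */
static SensorReading sensor_reading_zero(void)
{
    SensorReading reading = {0};
    return reading;
}
```

Because every field has a fixed size, the whole message can live on the stack or in static storage without any heap allocation.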

Nanopb: Protobuf for Microcontrollers

nanopb is ideal for ESP32 and other MCUs because:

Nanopb Feature	Description	Benefit for ESP32
Small Code Size	The core Nanopb library is very compact, and generated code is optimized for size.	Conserves precious flash memory on the ESP32, leaving more space for application logic.
Low RAM Usage	Avoids dynamic memory allocation by default. Data is typically stored in C structs on the stack or statically.	Crucial for ESP32’s limited RAM, preventing memory fragmentation and out-of-memory errors.
Streaming Support	Supports callback mechanisms for encoding/decoding large or variable-length fields (e.g., strings, bytes, repeated fields).	Allows processing of data larger than available RAM by handling it in chunks, essential for logs, firmware updates, or large sensor readings.
Customizable	`.options` files allow setting constraints like maximum string lengths, array counts, or field sizes.	Enables fine-grained control over memory allocation for message structures, helping to fit data within fixed memory budgets.
Plain C Implementation	Written in C, making it easy to integrate into C/C++ ESP-IDF projects.	No C++ overhead, straightforward compilation and linking within the ESP-IDF build system.

Small Code Size: The library itself is very compact.
Low RAM Usage: It avoids dynamic memory allocation by default for message structures. Data is typically stored directly in C structs.
Streaming Support: For fields that might be very large (like bytes or repeated fields), nanopb supports callback mechanisms. This allows you to process data in chunks without buffering the entire field in memory.
Customizable: Various options can be set in a .options file (alongside your .proto file) to control aspects like maximum field sizes, string lengths, or array counts, helping to manage memory.

Wire Format (Briefly)

Protobuf messages are encoded into a binary wire format. Each field is encoded as a key-value pair.

The key consists of the field number (tag) and a wire type (e.g., varint, 64-bit fixed, length-delimited, 32-bit fixed).
The value is the actual data for that field.

Scalar numeric types often use Varints, an encoding scheme that uses one or more bytes to serialize integers – smaller numbers take fewer bytes. This contributes significantly to the compactness of Protobuf messages. Strings, bytes, and embedded messages are length-delimited.
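A short sketch may help make varints tangible. Each byte carries seven bits of the value, least significant group first, and the top bit signals that more bytes follow. The function names below are illustrative only and are not nanopb APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Write value as a varint into out (up to 10 bytes); return the bytes written. */
static size_t encode_varint(uint64_t value, uint8_t *out)
{
    size_t n = 0;
    while (value >= 0x80) {
        out[n++] = (uint8_t)((value & 0x7F) | 0x80); /* low 7 bits, continuation bit set */
        value >>= 7;
    }
    out[n++] = (uint8_t)value;                       /* final byte, continuation bit clear */
    return n;
}

/* Encode a varint-typed field as key + value; return the total bytes written. */
static size_t encode_varint_field(uint32_t field_number, uint64_t value, uint8_t *out)
{
    /* Wire type 0 (varint) occupies the low three bits of the key. */
    size_t n = encode_varint((uint64_t)field_number << 3, out);
    return n + encode_varint(value, out + n);
}
```

Encoding field 1 with the value 150 produces the three bytes 08 96 01: one key byte, then 150 split into two seven-bit groups.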

This structured binary format is what makes Protobuf efficient to parse and allows for schema evolution (e.g., adding new fields with new tags won’t break older parsers, which will simply ignore unknown tags).
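The reason unknown fields can be ignored is that the wire type in each key tells a parser how long the value is, even when the field number means nothing to it. The sketch below imitates an "old" parser that only understands field 1 (a varint) and skips everything else. It is a simplified illustration with hypothetical function names, not how nanopb is implemented.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Read one varint at *pos and advance *pos; false on truncated or overlong input. */
static bool read_varint(const uint8_t *buf, size_t len, size_t *pos, uint64_t *out)
{
    uint64_t result = 0;
    for (unsigned shift = 0; shift < 64; shift += 7) {
        if (*pos >= len) return false;
        uint8_t byte = buf[(*pos)++];
        result |= (uint64_t)(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) { *out = result; return true; }
    }
    return false;
}

/* Skip a value whose length is implied by its wire type. */
static bool skip_value(const uint8_t *buf, size_t len, size_t *pos, unsigned wire_type)
{
    uint64_t n;
    switch (wire_type) {
    case 0: return read_varint(buf, len, pos, &n);                      /* varint */
    case 1: n = 8; break;                                               /* 64-bit fixed */
    case 2: if (!read_varint(buf, len, pos, &n)) return false; break;   /* length-delimited */
    case 5: n = 4; break;                                               /* 32-bit fixed */
    default: return false;
    }
    if (n > len - *pos) return false;
    *pos += (size_t)n;
    return true;
}

/* Parser that knows only field 1 (varint) and ignores all other fields. */
static bool parse_known_field1(const uint8_t *buf, size_t len, uint64_t *field1)
{
    size_t pos = 0;
    while (pos < len) {
        uint64_t key;
        if (!read_varint(buf, len, &pos, &key)) return false;
        if ((key >> 3) == 1 && (key & 7) == 0) {
            if (!read_varint(buf, len, &pos, field1)) return false;
        } else if (!skip_value(buf, len, &pos, (unsigned)(key & 7))) {
            return false;
        }
    }
    return true;
}
```

A message that carries an extra string in field 7 and a fixed32 in field 9 still parses successfully, and field 1 is recovered unchanged.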

Practical Examples

We’ll use nanopb for our ESP32 examples.

Setting up `nanopb` with ESP-IDF

Integrating nanopb involves a few steps:

Adding nanopb to your project:
- The easiest way is to add nanopb as an ESP-IDF component. You can clone the nanopb repository (or download a release) into your project’s components directory.
- Alternatively, you can use the ESP-IDF component manager: idf.py add-dependency "nanopb/nanopb^0.4.7" (check for the latest compatible version).
Ensuring protoc is available:
- You need to install the Protocol Buffer compiler (protoc) on your development machine. Download it from the official Google Protocol Buffers GitHub releases page. Make sure it’s in your system’s PATH.
Automating .proto compilation with CMake:
- You’ll want protoc to run automatically during the ESP-IDF build process whenever your .proto files change. This is done by adding custom CMake commands to your component’s CMakeLists.txt or your project’s main CMakeLists.txt.

graph TD
    A["<center><b>Start: ESP-IDF Project</b></center>"] --> B{"<center>1. Add Nanopb to Project</center>"};
    B --"Option A: As Component"--> B_A["<center>Clone/Download Nanopb<br>into 'components' directory</center>"];
    B --"Option B: Component Manager"--> B_B["<center>Run:<br><tt>idf.py add-dependency #quot;nanopb/nanopb^0.4.7#quot;</tt></center>"];
    
    B_A --> C{"<center>2. Ensure protoc is Available</center>"};
    B_B --> C;
    
    C --> C1["<center>Download Protoc Compiler<br>from Google's GitHub</center>"];
    C1 --> C2["<center>Add protoc to System PATH</center>"];
    
    C2 --> D{"<center>3. Automate .proto Compilation</center>"};
    D --> D1["<center>Modify <tt>CMakeLists.txt</tt><br>(Project or Component level)</center>"];
    D1 --> D2["<center>Add <tt>find_program(PROTOC_COMPILER protoc)</tt></center>"];
    D1 --> D3["<center>Set path to <tt>nanopb_generator.py</tt></center>"];
    D1 --> D4["<center>Add <tt>add_custom_command(...)</tt><br>to invoke protoc with nanopb plugin</center>"];
    D4 --> D5["<center>Specify <tt>--nanopb_out</tt> and <tt>--proto_path</tt></center>"];
    
    D5 --> E{"<center>4. Link Generated Files & Nanopb Lib</center>"};
    E --> E1["<center>Add generated <tt>.pb.c</tt> files to <tt>target_sources</tt> or <tt>SRCS</tt></center>"];
    E --> E2["<center>Add generated headers path to <tt>target_include_directories</tt> or <tt>INCLUDE_DIRS</tt></center>"];
    E --> E3["<center>Ensure main component <tt>REQUIRES nanopb</tt><br>in <tt>idf_component_register</tt></center>"];
    
    E3 --> F["<center><b>Integration Complete</b><br>Build Project (<tt>idf.py build</tt>)</center>"];

    classDef startNode fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6;
    classDef processNode fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF;
    classDef decisionNode fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E;
    classDef checkNode fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B;
    classDef endNode fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46;

    class A startNode;
    class B,C,D,E decisionNode;
    class B_A,B_B,C1,C2,D1,D2,D3,D4,D5,E1,E2,E3 processNode;
    class F endNode;

CMake Integration Example (in your main CMakeLists.txt or a component’s CMakeLists.txt):

Assuming nanopb is a component and your .proto files are in a proto subdirectory of your main directory:

CMake

# Find protoc compiler
find_program(PROTOC_COMPILER protoc)
if(NOT PROTOC_COMPILER)
    message(FATAL_ERROR "protoc compiler not found. Please install Protocol Buffers compiler (protoc) and ensure it's in your PATH.")
endif()

# Path to the nanopb protoc plugin (adjust if nanopb is located differently).
# protoc needs the protoc-gen-nanopb wrapper (protoc-gen-nanopb.bat on Windows),
# which in turn runs nanopb_generator.py.
set(NANOPB_GENERATOR ${CMAKE_CURRENT_LIST_DIR}/../components/nanopb/generator/protoc-gen-nanopb)
# Or, if using the IDF component manager, it might be:
# set(NANOPB_GENERATOR ${CMAKE_SOURCE_DIR}/managed_components/nanopb__nanopb/generator/protoc-gen-nanopb)
# Check the actual path after adding the dependency.

# Directory for generated files
set(PROTO_GEN_DIR ${CMAKE_BINARY_DIR}/proto_gen)
file(MAKE_DIRECTORY ${PROTO_GEN_DIR})

# List your .proto files
set(PROTO_FILES ${CMAKE_CURRENT_LIST_DIR}/proto/sensor.proto) # Add more if needed

# Custom command to generate .pb.c and .pb.h files
foreach(PROTO_FILE ${PROTO_FILES})
    get_filename_component(PROTO_FILENAME ${PROTO_FILE} NAME_WE) # Get filename without extension
    set(OUTPUT_C ${PROTO_GEN_DIR}/${PROTO_FILENAME}.pb.c)
    set(OUTPUT_H ${PROTO_GEN_DIR}/${PROTO_FILENAME}.pb.h)

    add_custom_command(
        OUTPUT ${OUTPUT_C} ${OUTPUT_H}
        COMMAND ${PROTOC_COMPILER}
                --plugin=protoc-gen-nanopb=${NANOPB_GENERATOR}
                --nanopb_out=${PROTO_GEN_DIR}
                --proto_path=${CMAKE_CURRENT_LIST_DIR}/proto # Directory containing your .proto files
                ${PROTO_FILE}
        DEPENDS ${PROTO_FILE} ${NANOPB_GENERATOR} # Re-run if .proto or generator changes
        COMMENT "Generating C sources from ${PROTO_FILE}"
        VERBATIM
    )
    list(APPEND PROTO_GENERATED_SOURCES ${OUTPUT_C})
    list(APPEND PROTO_GENERATED_HEADERS ${OUTPUT_H})
endforeach()

# Add the generated sources and headers to this component's library.
# ${COMPONENT_LIB} exists only after idf_component_register(), so this whole
# snippet belongs in the component's CMakeLists.txt (e.g. main/CMakeLists.txt),
# below the idf_component_register(... REQUIRES nanopb) call.
# ESP-IDF has no target named "app"; each component builds its own library.
target_sources(${COMPONENT_LIB} PRIVATE ${PROTO_GENERATED_SOURCES})
target_include_directories(${COMPONENT_LIB} PRIVATE ${PROTO_GEN_DIR})

# The nanopb runtime (pb_encode.c, pb_decode.c, pb_common.c) is linked through
# the REQUIRES nanopb entry of idf_component_register; no explicit
# target_link_libraries() call is needed.

Important: The main component must be registered, and must require nanopb, before the generation commands run. A minimal main/CMakeLists.txt therefore starts like this, with the generation snippet above placed below the idf_component_register call:

CMake

# In main/CMakeLists.txt

idf_component_register(SRCS "your_main_app_file.c" # Add your main C file(s)
                       INCLUDE_DIRS "."
                       REQUIRES nanopb) # This links against the nanopb component library

# ... generation snippet from above goes here ...

The generated files are attached after registration because ESP-IDF first evaluates each component's CMakeLists.txt in CMake script mode to collect requirements, and commands such as add_custom_command are not available in that pass. Everything placed after idf_component_register is skipped during that early pass.

Tip: Create a proto subdirectory in your main component (e.g., main/proto/) to keep your .proto files organized.

Example 1: Simple Sensor Data Serialization/Deserialization

Let’s use the sensor.proto file defined earlier.

1. main/proto/sensor.proto:

(Content as shown in the Theory section)

2. main/proto/sensor.options (Optional, for nanopb customization):

This file allows you to specify constraints for nanopb. For example, to limit the length of the location string and the count of calibration_coeffs:

Protocol Buffers

SensorReading.location max_size:16
SensorReading.calibration_coeffs max_count:5

Place this file in the same directory as sensor.proto. nanopb_generator.py will automatically pick it up if the names match.
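One practical consequence of max_size:16 is that the location field becomes a 16-byte char array, which leaves room for 15 characters plus the terminating NUL. The sketch below shows one way to fill such a field defensively. LOCATION_MAX_SIZE and set_location are hypothetical names for this illustration and are not generated by nanopb.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mirrors max_size:16 from sensor.options; the size includes the terminating NUL. */
#define LOCATION_MAX_SIZE 16

/* Copy src into the fixed-size field; on overflow leave dst empty and return false. */
static bool set_location(char dst[LOCATION_MAX_SIZE], const char *src)
{
    size_t len = strlen(src);
    if (len >= LOCATION_MAX_SIZE) {
        dst[0] = '\0';
        return false;
    }
    memcpy(dst, src, len + 1); /* +1 copies the NUL terminator as well */
    return true;
}
```

Rejecting an oversized value outright, rather than silently truncating it, keeps the sender and receiver in agreement about what was actually transmitted. Truncation would be an equally valid policy if the application prefers it.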

3. main/protobuf_example_main.c:

#include <stdio.h>
#include <string.h>
#include <freertos/FreeRTOS.h>
#include <freertos/task.h>
#include "esp_log.h"
#include "nvs_flash.h"

// Include generated protobuf header and nanopb core header
#include "sensor.pb.h" // This will be in your build/proto_gen/ directory
#include "pb_encode.h"
#include "pb_decode.h"

static const char *TAG = "PROTOBUF_EXAMPLE";

void protobuf_demo_task(void *pvParameters)
{
    // --- Serialization Example ---
    SensorReading sensor_data_tx = SensorReading_init_zero; // Initialize with defaults
    uint8_t tx_buffer[128]; // Buffer to hold serialized data
    pb_ostream_t tx_stream;

    // Populate the SensorReading message
    sensor_data_tx.timestamp = 1678886400000ULL; // Example timestamp
    sensor_data_tx.value = 25.7f;
    sensor_data_tx.type = SensorReading_SensorType_TEMPERATURE;
    // location is char[16] (max_size:16 in sensor.options): at most 15 characters plus the NUL terminator.
    strlcpy(sensor_data_tx.location, "LivingRoom", sizeof(sensor_data_tx.location));
    // proto3 singular fields have no has_ flag in the generated struct (nanopb only
    // generates has_location if the field is declared 'optional' in the .proto file).
    // An empty string is simply not encoded.

    sensor_data_tx.calibration_coeffs_count = 2;
    sensor_data_tx.calibration_coeffs[0] = 1.01f;
    sensor_data_tx.calibration_coeffs[1] = -0.5f;
    sensor_data_tx.battery_low = false;

    // Create an output stream for the buffer
    tx_stream = pb_ostream_from_buffer(tx_buffer, sizeof(tx_buffer));

    // Encode the message
    bool status = pb_encode(&tx_stream, SensorReading_fields, &sensor_data_tx);
    size_t message_length = tx_stream.bytes_written;

    if (!status) {
        ESP_LOGE(TAG, "Encoding failed: %s", PB_GET_ERROR(&tx_stream));
        vTaskDelete(NULL);
        return;
    }

    ESP_LOGI(TAG, "Successfully encoded message (%u bytes):", (unsigned)message_length);
    ESP_LOG_BUFFER_HEX(TAG, tx_buffer, message_length);


    // --- Deserialization Example ---
    SensorReading sensor_data_rx = SensorReading_init_zero; // Initialize to defaults
    pb_istream_t rx_stream;

    // Create an input stream from the buffer containing the serialized data
    rx_stream = pb_istream_from_buffer(tx_buffer, message_length);

    // Decode the message
    status = pb_decode(&rx_stream, SensorReading_fields, &sensor_data_rx);

    if (!status) {
        ESP_LOGE(TAG, "Decoding failed: %s", PB_GET_ERROR(&rx_stream));
        vTaskDelete(NULL);
        return;
    }

    ESP_LOGI(TAG, "Successfully decoded message:");
    ESP_LOGI(TAG, "Timestamp: %llu", sensor_data_rx.timestamp);
    ESP_LOGI(TAG, "Value: %.2f", sensor_data_rx.value);
    ESP_LOGI(TAG, "Type: %d (%s)", sensor_data_rx.type,
             sensor_data_rx.type == SensorReading_SensorType_TEMPERATURE ? "TEMPERATURE" :
             sensor_data_rx.type == SensorReading_SensorType_HUMIDITY ? "HUMIDITY" : "PRESSURE");

    // With max_size set in sensor.options, location is a fixed char array that nanopb
    // null-terminates on decode, so it can be printed directly. Without a size limit,
    // the field would be a callback and would need separate handling.
    ESP_LOGI(TAG, "Location: %s", sensor_data_rx.location);

    ESP_LOGI(TAG, "Calibration Coefficients (%d):", sensor_data_rx.calibration_coeffs_count);
    for (int i = 0; i < sensor_data_rx.calibration_coeffs_count; i++) {
        ESP_LOGI(TAG, "  Coeff %d: %.2f", i, sensor_data_rx.calibration_coeffs[i]);
    }
    ESP_LOGI(TAG, "Battery Low: %s", sensor_data_rx.battery_low ? "true" : "false");

    vTaskDelete(NULL);
}

void app_main(void)
{
    // Initialize NVS - boilerplate
    esp_err_t ret = nvs_flash_init();
    if (ret == ESP_ERR_NVS_NO_FREE_PAGES || ret == ESP_ERR_NVS_NEW_VERSION_FOUND) {
      ESP_ERROR_CHECK(nvs_flash_erase());
      ret = nvs_flash_init();
    }
    ESP_ERROR_CHECK(ret);

    xTaskCreate(protobuf_demo_task, "protobuf_demo_task", 4096, NULL, 5, NULL);
}

4. Build Instructions:

Ensure protoc is installed and in your PATH.
Ensure nanopb is correctly added as a component to your ESP-IDF project (e.g., in components/nanopb or via idf.py add-dependency).
Verify your CMake setup for protoc and nanopb_generator.py as described above.
Run:
idf.py set-target esp32 # or your target
idf.py build

5. Run/Flash/Observe:

Flash the firmware: idf.py -p /dev/ttyUSB0 flash monitor.
Observe the serial monitor output. You should see:
- Logs indicating successful encoding and the hex dump of the serialized data.
- Logs indicating successful decoding and the reconstructed data matching the original.
- The size of the encoded message, which should be significantly smaller than an equivalent JSON string.
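To put a rough number on the size claim, the sketch below estimates the wire size of the Example 1 message by hand, following standard proto3 encoding rules, and compares it with one plausible JSON rendering of the same data. Both functions are illustrative. The JSON layout is an assumption, and the byte count nanopb actually reports on the device is the authoritative figure.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Number of bytes a value occupies as a varint. */
static size_t varint_size(uint64_t value)
{
    size_t n = 1;
    while (value >= 0x80) {
        value >>= 7;
        n++;
    }
    return n;
}

/* Estimated wire size of the Example 1 message under proto3 rules. */
static size_t example_protobuf_size(void)
{
    size_t n = 0;
    n += 1 + varint_size(1678886400000ULL); /* timestamp: key + varint */
    n += 1 + 4;                             /* value: key + 32-bit float */
    /* type = TEMPERATURE (0) and battery_low = false are defaults: not encoded */
    n += 1 + 1 + strlen("LivingRoom");      /* location: key + length + bytes */
    n += 1 + 1 + 2 * 4;                     /* calibration_coeffs: packed, key + length + 2 floats */
    return n;
}

/* Length of a comparable JSON document (key names assumed to match the .proto). */
static size_t example_json_size(void)
{
    char json[256];
    int n = snprintf(json, sizeof json,
        "{\"timestamp\":%llu,\"value\":%.1f,\"type\":\"TEMPERATURE\","
        "\"location\":\"%s\",\"calibration_coeffs\":[%.2f,%.1f],\"battery_low\":false}",
        1678886400000ULL, 25.7, "LivingRoom", 1.01, -0.5);
    return n > 0 ? (size_t)n : 0;
}
```

Under these assumptions the Protobuf estimate comes to 34 bytes, and the JSON text is several times larger, mostly because every key name is spelled out in each message.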

Example 2: Using Protobuf with a Communication Protocol (Conceptual)

Protobuf is a serialization format, not a transport protocol. It’s commonly used with protocols like MQTT, CoAP, HTTP, or even raw TCP/UDP sockets.

Scenario: Sending SensorReading over MQTT.

Serialize: Your ESP32 collects sensor data, populates the SensorReading C struct, and serializes it into tx_buffer (resulting in message_length bytes) as shown in Example 1.
Transmit: This tx_buffer (of message_length bytes) becomes the payload of your MQTT message.
// Conceptual MQTT publish (assuming mqtt_client is initialized)
// esp_mqtt_client_publish(mqtt_client, "esp32/sensor/data", (const char *)tx_buffer, message_length, 1, 0);
Receive (on another device/server):
- The MQTT subscriber receives the byte array.
- It uses Protobuf (with the same .proto definition compiled for its language, e.g., Python, Java) to deserialize the bytes back into a SensorReading message object.

Similarly for CoAP:

The tx_buffer could be the payload for a CoAP PUT or POST request. The CoAP server would then deserialize it.

Benefit: The actual data sent over the network (tx_buffer) is compact, reducing bandwidth usage and transmission time.

Variant Notes

All ESP32 Variants (ESP32, ESP32-S2, ESP32-S3, ESP32-C3, ESP32-C6, ESP32-H2):
- Protocol Buffers, especially with nanopb, are well-suited for all ESP32 variants. nanopb's low resource footprint makes it viable even on single-core, memory-constrained variants like the ESP32-C3.
Performance:
- Encoding and decoding Protobuf messages involve some CPU overhead. More powerful cores (e.g., ESP32-S3, dual-core ESP32) will handle this faster than single-core variants.
- However, this CPU cost is often offset by the significant reduction in data size and parsing complexity compared to text-based formats, especially if network transmission is a bottleneck.
Memory (Flash/RAM):
- Flash: The nanopb runtime library is small. The generated .pb.c files will add to your code size, proportional to the complexity and number of your message definitions.
- RAM:
  - nanopb primarily uses stack memory for its operations and for storing the C structs representing your messages (if sizes are constrained via .options files).
  - Buffers for serialized/deserialized data need to be allocated (e.g., tx_buffer in the example). Their size depends on the maximum expected message size.
  - For very large or unbounded fields (strings, bytes, repeated fields) not constrained by .options, nanopb can use callbacks. This allows processing data in streams without buffering everything in RAM, but adds complexity to your application code.
Network Efficiency: The primary benefit across all variants is the reduction in payload size, which is particularly advantageous for battery-powered devices or those on metered/low-bandwidth networks (like LoRaWAN or NB-IoT, often bridged to IP protocols carrying Protobuf).
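The buffer-sizing point under RAM can be illustrated with a "sizing pass": run the encoder once against a sink that only counts bytes, then allocate or check the real buffer. The sketch below imitates that idea in plain C. sink_t, sink_write, and encode_bytes_field are hypothetical names for this illustration and not nanopb APIs, although nanopb's output streams support a comparable counting mode.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Output sink: with buf == NULL it only counts bytes (a sizing pass). */
typedef struct {
    uint8_t *buf;
    size_t capacity;
    size_t written;
} sink_t;

/* Append data to the sink; fail instead of writing past the end of the buffer. */
static bool sink_write(sink_t *s, const uint8_t *data, size_t len)
{
    if (s->buf != NULL) {
        if (len > s->capacity - s->written) return false; /* would overflow */
        memcpy(s->buf + s->written, data, len);
    }
    s->written += len;
    return true;
}

/* Encode one length-delimited field; assumes field_number < 16 and len < 128. */
static bool encode_bytes_field(sink_t *s, uint8_t field_number,
                               const uint8_t *data, uint8_t len)
{
    uint8_t header[2] = { (uint8_t)((field_number << 3) | 2), len };
    return sink_write(s, header, sizeof header) && sink_write(s, data, len);
}
```

Running the encoder first with a NULL buffer yields the exact size needed, at the cost of walking the message twice. A fixed worst-case buffer avoids the second pass but may waste RAM.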

Common Mistakes & Troubleshooting Tips

Mistake / Issue	Symptom(s)	Troubleshooting / Solution
.proto Syntax Errors	`protoc` compiler fails with error messages pointing to lines in the `.proto` file. Build may halt.	Carefully validate `.proto` syntax (message definitions, field types, tags, enums) against official Protobuf documentation (proto3). Check for typos, missing semicolons, incorrect keywords.
protoc Not Found / Path Issues	Build error: `protoc: command not found` or CMake error `protoc compiler not found`.	Ensure `protoc` is installed on your development machine. Verify that the directory containing the `protoc` executable is in your system’s PATH environment variable. In CMake, ensure `find_program(PROTOC_COMPILER protoc)` is successful.
CMake Integration Failure	Generated `.pb.c`/`.pb.h` files not created, or not found by the compiler. Linker errors related to Nanopb functions or generated symbols.	Double-check paths in `CMakeLists.txt` for `nanopb_generator.py`, `.proto` files, and output directories (`PROTO_GEN_DIR`). Ensure `nanopb_generator.py` is executable. Verify `target_sources` and `target_include_directories` correctly list generated files/paths. Confirm your main component `REQUIRES nanopb` in `idf_component_register`.
Outdated Generated Files	Runtime behavior doesn’t match recent `.proto` file changes. Unexpected data or errors during serialization/deserialization.	Ensure your build system correctly regenerates `.pb.c` and `.pb.h` files when `.proto` files (or `.options` files) change. The `DEPENDS` clause in `add_custom_command` is crucial. Perform a clean build: `idf.py fullclean && idf.py build`.
Buffer Overflows / Truncation (Serialization)	`pb_encode` returns `false`. `PB_GET_ERROR(&stream)` typically reports “stream full”. Data sent is incomplete.	Ensure the buffer provided to `pb_ostream_from_buffer` is large enough for the fully serialized message. Use the generated `<MessageName>_size` constant when all fields are bounded, or call Nanopb’s `pb_get_encoded_size()` first (this costs an extra encoding pass). Note that strings, bytes, and repeated fields without `max_size`/`max_count` in `.options` are generated as callback fields by default, so you must supply an encode callback for them.
Buffer Issues (Deserialization)	`pb_decode` returns `false`. `PB_GET_ERROR(&stream)` might report “end-of-stream”, “invalid wire_type”, or “string overflow”. Corrupted data after decoding.	When using `pb_istream_from_buffer`, ensure the provided `message_length` is the exact number of received Protobuf bytes, not the capacity of the receive buffer. If using `.options` to define `max_size` for strings/bytes or `max_count` for repeated fields, ensure the incoming data doesn’t exceed these limits. If it does, decoding will fail. Consider callbacks for unbounded fields.
Incorrect `_count` for Repeated Fields	Serialization: Not all elements of a repeated field are encoded. Deserialization: `_count` is 0 or incorrect, leading to missed data or reading garbage.	Encoding: Before calling `pb_encode`, set the `fieldname_count` member of your struct to the actual number of elements you’ve populated in the array. Decoding: After `pb_decode`, check `fieldname_count` to know how many elements were successfully decoded into the array. Loop up to this count.
Handling Optional Scalar Fields (proto3) & `has_` fields	Confusion about whether a scalar field (int, bool, enum) was actually present in the message or just has its default value (e.g., 0 for int, false for bool).	In proto3, a plain scalar field has no presence tracking: Nanopb generates no `has_fieldname` member for it and does not encode it when it holds the default value, so the receiver cannot distinguish “unset” from “set to 0”. To track presence, declare the field `optional` in the `.proto` file (proto2, or proto3 with a sufficiently recent `protoc` and Nanopb); Nanopb then generates `has_fieldname`. Encoding: set `your_message.has_fieldname = true;` before encoding, otherwise the field is skipped even if you assigned a value. Decoding: check `your_message.has_fieldname`. If `true`, the field was present in the stream. If `false`, it was not, and the field in your struct holds its default value (e.g., 0, false). For strings and bytes without a `has_` field, an empty string/byte array is how “not present” appears.
Mismatched `.proto` / `.options` Files	Generator errors or warnings about unknown fields in `.options`, or runtime issues if `.options` constraints (e.g., `max_size`) don’t match what the other end sends.	Ensure the `.options` file (e.g., `MyMessage.options`) corresponds exactly to the `MyMessage.proto` file it’s meant to customize. All communicating parties must use compatible `.proto` definitions. If one side uses `.options` to limit string size and a peer sends a longer string, decoding on the constrained side fails rather than truncating the data, so agree on the limits across all implementations.
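The `_count` row in the table above can be sketched with a stand-in struct. `MockBatch` below only imitates the shape nanopb typically generates for `repeated int32 samples = 1;` with `max_count:8` in the `.options` file. The struct name, field name, and helper functions are hypothetical, and the real types come from your generated `.pb.h` (where the count member uses nanopb's `pb_size_t`). No nanopb calls are made here; the sketch only shows where the count is set before encoding and how it bounds the loop after decoding.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a nanopb-generated struct (hypothetical names). */
typedef struct {
    uint16_t samples_count;  /* number of valid entries in samples[] */
    int32_t  samples[8];     /* capacity fixed by max_count:8 */
} MockBatch;

/* Before encoding: copy values in and set the count.
 * Values beyond the array capacity are dropped. Returns the number stored. */
size_t batch_fill(MockBatch *b, const int32_t *src, size_t n)
{
    size_t cap = sizeof b->samples / sizeof b->samples[0];
    if (n > cap) {
        n = cap;             /* never exceed max_count */
    }
    for (size_t i = 0; i < n; i++) {
        b->samples[i] = src[i];
    }
    b->samples_count = (uint16_t)n;  /* the encoder only emits this many */
    return n;
}

/* After decoding: loop up to samples_count, never over the whole array. */
int64_t batch_sum(const MockBatch *b)
{
    int64_t sum = 0;
    for (size_t i = 0; i < b->samples_count; i++) {
        sum += b->samples[i];
    }
    return sum;
}
```

Forgetting the assignment to `samples_count` leaves it at zero in a zero-initialized struct, which matches the symptom described in the table: the repeated field is silently omitted from the encoded message.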

Exercises

Device Configuration Message:
- Define a .proto message named DeviceConfig with fields for Wi-Fi SSID (string), password (string), an update interval (uint32, in seconds), and a list of enabled sensor types (repeated enum, using SensorType from the chapter example).
- Write an ESP32 application that:
  - Initializes a DeviceConfig struct with sample data.
  - Serializes it to a buffer and logs the hex output.
  - Deserializes it back and prints the configuration.
- Use an .options file to set max_size for SSID and password, and max_count for enabled sensor types.
Protobuf with Callbacks for Large Data:
- Modify the SensorReading message to include a bytes diagnostic_log = 7; field.
- Research and implement nanopb callbacks for this diagnostic_log field.
- Serialization: Instead of having a large byte array in your C struct, use an encoding callback that provides data for diagnostic_log in chunks (e.g., read from a dummy source or generate it).
- Deserialization: Use a decoding callback that receives chunks of diagnostic_log data and processes them (e.g., prints them to the console) instead of buffering the entire log.
- This exercise demonstrates handling data larger than available RAM.
Error Handling and Validation:
- Take the first practical example (SensorReading).
- Introduce error conditions:
  - Try to encode into a buffer that is too small. Check the return status of pb_encode and log PB_GET_ERROR(&tx_stream).
  - Simulate corrupted Protobuf data (e.g., by slightly altering the tx_buffer after encoding) and try to decode it. Check the return status of pb_decode and log PB_GET_ERROR(&rx_stream).
- Implement basic validation after decoding (e.g., check if sensor_data_rx.value is within an expected range for the given sensor_data_rx.type).

Summary

Protocol Buffers (Protobuf) offer an efficient, strongly-typed, and language-neutral way to serialize structured data, ideal for resource-constrained IoT devices like the ESP32.
Data structures (messages) are defined in .proto files using a clear syntax.
The protoc compiler, along with a C-specific plugin like nanopb, generates C structs and encode/decode functions from .proto definitions.
nanopb is specifically designed for microcontrollers, offering low code size, minimal RAM usage, and no dynamic memory allocation by default.
Serialization (encoding) converts C structs into a compact binary format. Deserialization (decoding) reconstructs C structs from this binary data.
The binary wire format is compact due to techniques like Varint encoding and tag-based field identification, which also supports schema evolution.
Protobuf serialized data can be used as a payload in various communication protocols (MQTT, CoAP, HTTP, etc.), significantly reducing network bandwidth.
Proper CMake integration is crucial for automating the generation of Protobuf C files within an ESP-IDF project.
