Chapter 297: Certificate Management for IoT

Chapter Objectives

By the end of this chapter, you will be able to:

  • Explain the structure of a Public Key Infrastructure (PKI) and the concept of a “chain of trust.”
  • Describe the key stages of a certificate’s lifecycle: creation and provisioning, rotation, and revocation.
  • Compare different strategies for provisioning device certificates at scale, including the highly scalable Just-in-Time Provisioning (JITP) model.
  • Implement a basic mechanism for remote certificate rotation on an ESP32.
  • Understand the importance and methods of certificate revocation.
  • Appreciate how advanced ESP32 hardware features facilitate secure and robust certificate management.

Introduction

In the previous chapter, we built a secure data pipeline using mutual TLS, relying on a unique certificate and private key to give our device a secure identity. This is perfect for a single device, but it immediately raises a critical question: how does this scale? How do you create, install, and manage unique identities for a fleet of ten, ten thousand, or ten million devices? The manual process of generating keys and embedding them in firmware is simply not feasible.

This is the domain of Certificate Management, a crucial discipline for any serious IoT deployment. It involves automating the entire lifecycle of a certificate, from its birth in a secure system to its eventual retirement. Without a robust certificate management strategy, a product cannot scale, its security will degrade over time, and it will be unable to recover from a potential device compromise.

This chapter will guide you through the architecture and best practices of managing certificates at scale. We will explore how large-scale systems handle provisioning and how to design firmware that can securely update its own identity over its operational life.

Theory

Effective certificate management is built on the foundation of a Public Key Infrastructure (PKI). A PKI is the combination of roles, policies, hardware, and software needed to create, manage, distribute, use, store, and revoke digital certificates.

1. The PKI Chain of Trust

A single device certificate is trusted because it is part of a “chain of trust” that originates from a highly protected Root Certificate Authority (CA).

  • Root CA: This is the ultimate source of trust in the PKI. Its certificate is self-signed. The Root CA’s primary role is to sign a small number of Intermediate CA certificates. Its private key is kept extremely secure, often in a hardware security module (HSM) that is kept offline.
  • Intermediate CA: This CA is trusted because its certificate was signed by the Root CA. It acts as a workhorse, signing the certificates for end-entities (like our ESP32 devices). Using intermediates adds a layer of security; if an intermediate CA is compromised, you can revoke it without having to distrust the entire root.
  • End-Entity (Device) Certificate: This is the certificate installed on the ESP32. It is trusted because it was signed by a trusted Intermediate CA.

This hierarchy allows a relying party (like an MQTT broker) to verify a device’s certificate by tracing its signature back up the chain to a Root CA that it already trusts.

graph TD
    subgraph "Public Key Infrastructure (PKI)"
    direction TB
    
    A[<b>Offline Root CA</b><br><i>Ultimate Trust Anchor</i><br>Self-Signed]
    B(<b>Online Intermediate CA</b><br>Signed by Root CA)
    C{<b>Device Certificate</b><br>Signed by Intermediate CA}
    D{<b>Device Certificate</b><br>Signed by Intermediate CA}
    E{<b>Device Certificate</b><br>Signed by Intermediate CA}
    
    A -- "Signs" --> B
    B -- "Signs" --> C
    B -- "Signs" --> D
    B -- "Signs" --> E

    end

    classDef root fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6
    classDef intermediate fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
    classDef device fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46
    
    class A root
    class B intermediate
    class C,D,E device
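
The chain walk described above can be sketched in a few lines of C. This is a toy model only: the toy_cert_t type and chain_is_trusted function are hypothetical names, and the cryptographic signature check that a real TLS stack performs at each link is deliberately left out. It shows only the structural rule that a chain is accepted when it ends at a self-signed certificate the verifier already trusts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy certificate: real X.509 parsing and signature checks are omitted. */
typedef struct toy_cert {
    const char *subject;            /* who this certificate identifies      */
    const char *issuer;             /* who signed it                        */
    const struct toy_cert *signer;  /* issuer's certificate, NULL if absent */
} toy_cert_t;

/* Walk from the leaf towards the root. Succeed only if the walk ends at a
 * self-signed certificate whose subject is the trusted root. */
static bool chain_is_trusted(const toy_cert_t *leaf,
                             const char *trusted_root_subject,
                             int max_depth)
{
    assert(trusted_root_subject != NULL);
    const toy_cert_t *cur = leaf;
    for (int depth = 0; cur != NULL && depth < max_depth; depth++) {
        if (strcmp(cur->subject, cur->issuer) == 0) {
            /* Self-signed: this is a root. Is it one we trust? */
            return strcmp(cur->subject, trusted_root_subject) == 0;
        }
        /* The named issuer must match the certificate that signed it. */
        if (cur->signer == NULL ||
            strcmp(cur->issuer, cur->signer->subject) != 0) {
            return false;
        }
        cur = cur->signer;
    }
    return false; /* chain is broken or longer than max_depth */
}
```

A device certificate issued by the intermediate passes this check, while a certificate chaining to an unknown self-signed root is rejected even though its own chain is internally consistent.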

2. The Certificate Lifecycle

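Every certificate passes through a lifecycle of three stages: creation and provisioning, rotation, and revocation. The sequence diagram below traces the Just-in-Time Provisioning (JITP) flow that is described under stage 1.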

sequenceDiagram
    participant D as ESP32 Device
    participant C as Cloud Provisioning Service
    participant R as Certificate Authority (CA)
    participant S as Device Registry

    D->>+C: 1. Connect with Bootstrap Cert
    C->>C: 2. Verify Bootstrap Cert
    C->>+R: 3. Request New Device Cert
    R-->>-C: 4. Generate & Sign Cert
    C->>+S: 5. Register New Cert & Device ID
    S-->>-C: 6. Acknowledge Registration
    C-->>-D: 7. Deliver Operational Cert & Key
    D->>D: 8. Store credentials in NVS
    Note over D: Disconnects & Deletes Bootstrap Identity
    D->>+C: 9. Reconnect with New Operational Cert
    C-->>-D: Connection Successful
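
On the device side, the sequence above can be read as a small state machine. The sketch below is one possible way to model it, not a prescribed design: the state and event names are hypothetical, and the networking and storage calls are left to the surrounding firmware. The point it illustrates is that the bootstrap identity is only usable until the operational credentials have been stored.

```c
#include <assert.h>
#include <stdbool.h>

/* Device-side provisioning states (illustrative names). */
typedef enum {
    PROV_BOOTSTRAP,            /* only the shared bootstrap identity exists    */
    PROV_AWAITING_CREDENTIALS, /* steps 1-6: connected, waiting for the cloud  */
    PROV_CREDENTIALS_STORED,   /* steps 7-8: operational credentials persisted */
    PROV_OPERATIONAL           /* step 9: reconnected with the new identity    */
} prov_state_t;

typedef enum {
    EV_BOOTSTRAP_CONNECTED,
    EV_CREDENTIALS_STORED,
    EV_OPERATIONAL_CONNECTED,
    EV_FAILURE
} prov_event_t;

/* Return the next state. Events that do not fit the current state are ignored. */
static prov_state_t prov_next(prov_state_t s, prov_event_t ev)
{
    switch (s) {
    case PROV_BOOTSTRAP:
        return ev == EV_BOOTSTRAP_CONNECTED ? PROV_AWAITING_CREDENTIALS : s;
    case PROV_AWAITING_CREDENTIALS:
        if (ev == EV_CREDENTIALS_STORED) return PROV_CREDENTIALS_STORED;
        if (ev == EV_FAILURE) return PROV_BOOTSTRAP; /* retry from the start */
        return s;
    case PROV_CREDENTIALS_STORED:
        return ev == EV_OPERATIONAL_CONNECTED ? PROV_OPERATIONAL : s;
    default:
        assert(s == PROV_OPERATIONAL); /* no other state is defined */
        return s;
    }
}

/* The bootstrap certificate may only be presented before credentials are stored. */
static bool prov_may_use_bootstrap(prov_state_t s)
{
    return s == PROV_BOOTSTRAP || s == PROV_AWAITING_CREDENTIALS;
}
```

Keeping the transitions in one pure function makes the flow easy to unit-test on a host before any ESP-IDF code is involved.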

  1. Creation & Provisioning (The “Birth”): This is the process of generating a unique identity and securely installing it on a device. How this is done is one of the most important architectural decisions you will make.
    • Pre-provisioning: The simplest method. You generate a batch of certificates and private keys and embed them into the firmware or flash them directly onto devices in the factory. This is secure but inflexible. If you need to produce more devices than you have certificates, you have to generate more and update the factory process.
    • Just-in-Time Provisioning (JITP): A far more scalable and flexible method, widely used by major cloud providers.
      1. Factory: All devices are flashed with the same generic “bootstrap” certificate. This certificate is from a special, restricted provisioning CA.
      2. First Boot: The device boots up, connects to a provisioning service in the cloud, and presents its bootstrap certificate.
      3. Onboarding: The cloud service verifies the bootstrap certificate. It then just-in-time generates a new, unique, long-term operational certificate for that specific device. It registers this new certificate and associates it with the device’s ID.
      4. Delivery: The cloud sends the new operational certificate and its private key back to the device.
      5. Activation: The device securely stores these new credentials (e.g., in encrypted NVS), disconnects, and reconnects using its new, permanent identity. The bootstrap certificate is never used again.
  2. Rotation (The “Renewal”): Certificates are intentionally created with a limited lifespan (e.g., 1-2 years). This is a security measure that limits the window of opportunity for an attacker if a private key is ever compromised. Certificate rotation is the process of replacing an expiring certificate with a new one before it becomes invalid. This is typically an automated process, often triggered by the cloud, where the device is instructed to generate a new key pair and a Certificate Signing Request (CSR), which it sends to the CA to be signed.
  3. Revocation (The “Retirement”): If a device is lost, stolen, or known to be compromised, its certificate must be immediately invalidated to prevent it from accessing your system. This is called revocation.
    • Cloud Registry Method: The simplest and most common method in IoT. You simply deactivate or delete the certificate from your cloud platform’s device registry (e.g., AWS IoT Registry). The broker will then reject any connection attempts from that certificate.
    • Traditional Methods: For more complex PKIs, Certificate Revocation Lists (CRLs) or the Online Certificate Status Protocol (OCSP) are used. These allow clients to check if a certificate’s serial number has been officially revoked by the CA.
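
The cloud registry method of revocation amounts to a lookup at connection time. The following is a minimal sketch of that idea, assuming a hypothetical in-memory registry keyed by a certificate identifier such as a fingerprint. Real platforms keep this state in a managed service; the names here are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum { CERT_ACTIVE, CERT_REVOKED } cert_status_t;

typedef struct {
    const char *cert_id;   /* e.g. a certificate fingerprint (assumption) */
    cert_status_t status;
} registry_entry_t;

/* Linear search is enough for a sketch; a real registry would be indexed. */
static registry_entry_t *registry_find(registry_entry_t *reg, size_t n,
                                       const char *cert_id)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(reg[i].cert_id, cert_id) == 0) {
            return &reg[i];
        }
    }
    return NULL;
}

/* Broker-side check: unknown and revoked certificates are both rejected. */
static bool registry_allows_connection(registry_entry_t *reg, size_t n,
                                       const char *cert_id)
{
    assert(cert_id != NULL);
    const registry_entry_t *e = registry_find(reg, n, cert_id);
    return e != NULL && e->status == CERT_ACTIVE;
}

/* Operator action: mark a certificate as revoked. Returns false if unknown. */
static bool registry_revoke(registry_entry_t *reg, size_t n, const char *cert_id)
{
    registry_entry_t *e = registry_find(reg, n, cert_id);
    if (e == NULL) {
        return false;
    }
    e->status = CERT_REVOKED;
    return true;
}
```

Because the check runs on every connection attempt, revoking an entry takes effect the next time the device tries to connect, without the device having to fetch a CRL.
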
graph LR
    subgraph "Certificate Lifecycle"
    
    A(<b>1. Creation & Provisioning</b><br>Device gets its initial identity)
    B(<b>2. Rotation</b><br>Identity is renewed periodically)
    C(<b>3. Revocation</b><br>Identity is invalidated if compromised)
    D((Active<br>Life))

    A -- "Securely Installed" --> D
    D -- "Nears Expiration" --> B
    B -- "New Certificate Issued" --> D
    D -- "Device Compromised" --> C
    
    end

    classDef creation fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6
    classDef rotation fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E
    classDef revocation fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B
    classDef active fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46

    class A creation
    class B rotation
    class C revocation
    class D active
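
The “Nears Expiration” transition in the lifecycle can be expressed as a simple time comparison. The sketch below assumes the certificate's expiry (its notAfter field) has already been extracted as a time_t; the function name and the idea of a configurable window in days are illustrative choices, not part of any particular platform API.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Return true when the certificate expires within window_days of now,
 * or has already expired. */
static bool cert_needs_rotation(time_t now, time_t not_after, int window_days)
{
    assert(window_days >= 0);
    const double window_seconds = (double)window_days * 24.0 * 60.0 * 60.0;
    /* difftime gives (not_after - now) in seconds; negative if expired. */
    return difftime(not_after, now) <= window_seconds;
}
```

A fleet service could run a check like this against every registered certificate and start rotation for the ones that return true, so that renewal happens well before devices begin to fail authentication.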

Practical Example: Certificate Rotation from NVS

This example demonstrates a basic certificate rotation mechanism. The device will initially use certificates embedded in the firmware. It will listen on an MQTT topic for a “rotate” command containing new certificate data. It will save this data to NVS and reboot to use the new credentials.

1. Project Setup

  • Follow the setup from Chapter 296 to embed your initial ca.pem, device.crt.pem, and device.key.pem files. These will be our “factory” certificates.
  • Enable NVS encryption in menuconfig (Component config -> NVS -> [*] Enable NVS encryption) to protect the new key we will store.

2. The Code

The core idea is to modify start_secure_mqtt to first check NVS for rotated certificates before falling back to the embedded ones.

C
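#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "esp_log.h"
#include "esp_system.h"   // esp_restart()
#include "esp_wifi.h"
#include "nvs_flash.h"
#include "nvs.h"
#include "esp_event.h"
#include "mqtt_client.h"  // ESP-MQTT client API
#include "cJSON.h"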

static const char *TAG = "CERT_MGMT";

// --- Embedded "Factory" Certificates ---
extern const uint8_t ca_pem_start[] asm("_binary_ca_pem_start");
extern const uint8_t device_crt_pem_start[] asm("_binary_device_crt_pem_start");
extern const uint8_t device_key_pem_start[] asm("_binary_device_key_pem_start");

// Assume mqtt_client is initialized and connected elsewhere
extern esp_mqtt_client_handle_t mqtt_client;
extern char device_id[];

// --- Rotation Handler ---
void handle_cert_rotation_command(const char *payload, int len) {
    ESP_LOGI(TAG, "Certificate rotation command received.");
    
    cJSON *root = cJSON_ParseWithLength(payload, len);
    if (root == NULL) {
        ESP_LOGE(TAG, "Failed to parse rotation JSON");
        return;
    }

    cJSON *new_cert_node = cJSON_GetObjectItem(root, "new_certificate");
    cJSON *new_key_node = cJSON_GetObjectItem(root, "new_private_key");

    if (!cJSON_IsString(new_cert_node) || !cJSON_IsString(new_key_node)) {
        ESP_LOGE(TAG, "Invalid JSON format for rotation.");
        cJSON_Delete(root);
        return;
    }

    nvs_handle_t nvs_handle;
    esp_err_t err = nvs_open("certs", NVS_READWRITE, &nvs_handle);
    if (err != ESP_OK) {
        ESP_LOGE(TAG, "Error opening NVS handle: %s", esp_err_to_name(err));
        cJSON_Delete(root);
        return;
    }
    
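    // Write new certs to NVS; skip the reboot if any step fails
    err = nvs_set_str(nvs_handle, "client_cert", new_cert_node->valuestring);
    if (err == ESP_OK) {
        err = nvs_set_str(nvs_handle, "client_key", new_key_node->valuestring);
    }
    if (err == ESP_OK) {
        err = nvs_commit(nvs_handle);
    }
    nvs_close(nvs_handle);
    cJSON_Delete(root);

    if (err != ESP_OK) {
        ESP_LOGE(TAG, "Failed to store new certificates: %s", esp_err_to_name(err));
        return;
    }

    ESP_LOGW(TAG, "New certificates stored in NVS. Rebooting in 5 seconds...");
    vTaskDelay(pdMS_TO_TICKS(5000));
    esp_restart();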
}

void start_secure_mqtt(void) {
    esp_mqtt_client_config_t mqtt_cfg = {
        .broker.address.uri = "mqtts://your-mqtt-broker-endpoint:8883",
        .broker.verification.certificate = (const char *)ca_pem_start, // CA is assumed to be static
    };

    // --- Load Rotated Certs from NVS if they exist ---
    // ESP-MQTT documents the certificate and key fields as not copied by the
    // client, so these buffers are static and stay allocated while it runs.
    static char *client_cert = NULL;
    static char *client_key = NULL;
    nvs_handle_t nvs_handle;
    size_t cert_len = 0, key_len = 0;

    if (nvs_open("certs", NVS_READONLY, &nvs_handle) == ESP_OK) {
        if (nvs_get_str(nvs_handle, "client_cert", NULL, &cert_len) == ESP_OK &&
            nvs_get_str(nvs_handle, "client_key", NULL, &key_len) == ESP_OK) {

            ESP_LOGI(TAG, "Found rotated certificates in NVS. Attempting to use them.");
            client_cert = malloc(cert_len);
            client_key = malloc(key_len);
            if (client_cert && client_key &&
                nvs_get_str(nvs_handle, "client_cert", client_cert, &cert_len) == ESP_OK &&
                nvs_get_str(nvs_handle, "client_key", client_key, &key_len) == ESP_OK) {
                mqtt_cfg.credentials.authentication.certificate = client_cert;
                mqtt_cfg.credentials.authentication.key = client_key;
            } else {
                // Allocation or read failed: release both and use factory certs
                ESP_LOGE(TAG, "Failed to load rotated certificates from NVS.");
                free(client_cert);
                free(client_key);
                client_cert = NULL;
                client_key = NULL;
            }
        }
        nvs_close(nvs_handle);
    }
    
    // --- Fallback to Factory Certs ---
    if (client_cert == NULL) {
        ESP_LOGI(TAG, "No rotated certificates found. Using factory certificates.");
        mqtt_cfg.credentials.authentication.certificate = (const char *)device_crt_pem_start;
        mqtt_cfg.credentials.authentication.key = (const char *)device_key_pem_start;
    }

    esp_mqtt_client_handle_t client = esp_mqtt_client_init(&mqtt_cfg);
    // ... register event handler ...
    esp_mqtt_client_start(client);
    
    // IMPORTANT: Do not free client_cert or client_key here. The client keeps
    // using these pointers for TLS; free them only after the client is destroyed.
}

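/*
// In your MQTT event handler, subscribe to the rotation topic and add the handler.
// rotation_topic is used by two cases, so declare it at file scope:
//     static char rotation_topic[128];
// Note: a payload larger than the client's buffer is delivered in several
// MQTT_EVENT_DATA events, so raise the buffer size or reassemble the chunks.
case MQTT_EVENT_CONNECTED:
    snprintf(rotation_topic, sizeof(rotation_topic), "devices/%s/certs/rotate", device_id);
    esp_mqtt_client_subscribe(mqtt_client, rotation_topic, 1);
    break;
case MQTT_EVENT_DATA:
    // Compare lengths too, so a topic that is only a prefix does not match
    if (event->topic_len == (int)strlen(rotation_topic) &&
        strncmp(event->topic, rotation_topic, event->topic_len) == 0) {
        handle_cert_rotation_command(event->data, event->data_len);
    }
    break;
*/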
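
Before a received certificate or key is written to NVS, it can help to reject payloads that are obviously malformed, since a bad write only shows up after the reboot. The helper below is a hypothetical sketch, not part of ESP-IDF or mbedtls. It checks only that the string starts with the expected PEM header, continues with a real line break, and contains the matching footer. It is a formatting check, not a cryptographic validation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if pem looks like a single PEM block with the given label,
 * e.g. "CERTIFICATE" or "RSA PRIVATE KEY". */
static bool pem_looks_valid(const char *pem, const char *label)
{
    assert(label != NULL);
    char begin[64];
    char end[64];
    if (pem == NULL) {
        return false;
    }
    int nb = snprintf(begin, sizeof(begin), "-----BEGIN %s-----", label);
    int ne = snprintf(end, sizeof(end), "-----END %s-----", label);
    if (nb < 0 || nb >= (int)sizeof(begin) || ne < 0 || ne >= (int)sizeof(end)) {
        return false; /* label too long for the scratch buffers */
    }
    /* The header must be the very first thing in the string. */
    if (strncmp(pem, begin, (size_t)nb) != 0) {
        return false;
    }
    /* A real line break must follow the header. A literal backslash here
     * means the newlines were double-escaped in the JSON payload. */
    if (pem[nb] != '\n' && pem[nb] != '\r') {
        return false;
    }
    /* The footer must appear after a non-empty body. */
    const char *footer = strstr(pem + nb, end);
    return footer != NULL && footer > pem + nb + 1;
}
```

In handle_cert_rotation_command, a check like this could run on both strings before nvs_open. The key label depends on how the key was generated, so "RSA PRIVATE KEY" from the example payload is only one possibility.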

3. Build and Run

  1. Set up the project with your factory certs as in the previous chapter.
  2. Integrate the code above. The start_secure_mqtt replaces the previous version, and you need to add the rotation logic to your MQTT event handler.
  3. Generate a second set of device certificates (device_v2.crt.pem, device_v2.key.pem).
  4. Flash and run. The device will connect using the factory certs.
  5. Using an MQTT client, publish a JSON payload to the devices/{your-device-id}/certs/rotate topic. The payload should contain the contents of your v2 certificate and key, for example: {"new_certificate": "-----BEGIN CERTIFICATE-----\n...", "new_private_key": "-----BEGIN RSA PRIVATE KEY-----\n..."}.
  6. Observe:
    • The device will log that it received the command and is rebooting.
    • After reboot, the log will show “Found rotated certificates in NVS…”.
    • The device will now attempt to connect using these new v2 certificates. If successful, it has rotated its identity.
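
Building the JSON payload by hand is error-prone because every line break in the PEM files has to become the two-character sequence \n inside the JSON string. The helper below is a small host-side sketch of that escaping step. The function name is hypothetical, and it handles only the characters that can reasonably occur in PEM text; other control characters would need \u escapes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Escape in for use inside a JSON string value.
 * Returns the number of bytes written (excluding the terminator),
 * or -1 if out is too small. */
static int json_escape(const char *in, char *out, size_t out_size)
{
    assert(in != NULL && out != NULL);
    if (out_size == 0) {
        return -1;
    }
    size_t o = 0;
    for (const char *p = in; *p != '\0'; p++) {
        char esc = 0;
        switch (*p) {
        case '\n': esc = 'n';  break;
        case '\r': esc = 'r';  break;
        case '\t': esc = 't';  break;
        case '"':  esc = '"';  break;
        case '\\': esc = '\\'; break;
        default:   break;
        }
        size_t need = esc ? 2 : 1;
        if (o + need + 1 > out_size) {
            return -1; /* no room for this character plus the terminator */
        }
        if (esc) {
            out[o++] = '\\';
        }
        out[o++] = esc ? esc : *p;
    }
    out[o] = '\0';
    return (int)o;
}
```

Running both PEM files through a function like this and placing the results in the new_certificate and new_private_key fields yields a payload that cJSON on the device decodes back into the original multi-line text.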

Variant Notes

  • Secure Storage is Paramount: The example stores the rotated private key in NVS. This is only secure if NVS Encryption and Flash Encryption are enabled. This applies to all variants. Without them, an attacker could read the rotated key from the flash chip.
  • The Power of the Digital Signature (DS) Peripheral (ESP32-S2 and later): This is where newer variants provide a massive security upgrade. Instead of the cloud sending a new private key to the device (which is risky), a device with the DS peripheral can perform a much more secure rotation:
    1. The device receives a rotation command.
    2. It uses its hardware random number generator and the mbedtls library to generate a new key pair on-chip. The DS peripheral does not generate keys itself: the private key is encrypted for the peripheral with a key derived from an eFuse-held HMAC key, after which signing happens in hardware and software cannot read the key back.
    3. The device generates a Certificate Signing Request (CSR) for the new public key.
    4. It sends this CSR to the cloud.
    5. The cloud CA signs the CSR and sends the new certificate back to the device.
    6. The device stores the new certificate and uses it along with the hardware-protected private key for future connections.

  This process means the private key is never transmitted over the network and never sits in flash memory in plaintext, which is considerably stronger than having the cloud deliver keys to the device.

graph TD
    subgraph "On-Device (ESP32 with DS Peripheral)"
        A(<b>Receive Rotation Command</b><br>From MQTT Broker)
        B{<b>Generate New Key Pair</b><br>Private key generated on-chip and<br>handed to the DS Peripheral in encrypted form.<br><i>Key never leaves the device.</i>}
        C(<b>Generate CSR</b><br>Create Certificate Signing Request<br>using the new public key)
    end

    subgraph "Cloud Services"
        D[<b>Cloud CA</b>]
        E(<b>Sign CSR</b><br>CA verifies the request and<br>signs the new public key)
        F[<b>New Device Certificate</b>]
    end
    
    subgraph "On-Device (ESP32 with DS Peripheral)"
        G(<b>Store New Certificate</b><br>Save the new certificate to NVS)
        H((<b>Secure Connection</b><br>Uses new certificate from NVS<br>and private key from DS hardware))
    end

    A --> B
    B --> C
    C -- "Sends CSR to Cloud" --> E
    D -- "Signs with Intermediate Key" --> E
    E -- "Sends New Certificate to Device" --> F
    F --> G
    G --> H

    classDef start fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6
    classDef process fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF
    classDef decision fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E
    classDef check fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B
    classDef success fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46

    class A start
    class B,C,E,G process
    class D,F check
    class H success

Common Mistakes & Troubleshooting Tips

| Mistake / Issue | Symptom(s) | Troubleshooting / Solution |
| --- | --- | --- |
| No Rotation Fallback | Device enters a boot loop after a failed rotation attempt. Logs show repeated connection failures with NVS certificates. | Implement a boot counter in RTC memory. If the connection fails 3 times with NVS certs, erase the certs from NVS and reboot. This forces a fallback to the factory certificates. |
| Incorrect PEM Formatting in JSON | Device receives the rotation command but fails to parse the new certificate. Log shows "Failed to parse rotation JSON" or TLS handshake errors after reboot. | Ensure newline characters (\n) are correctly escaped in the JSON string. Use a script or online tool to properly format the PEM content into a valid JSON string value. |
| No Certificate Expiration Monitoring | A large number of devices suddenly fail to connect. The MQTT broker rejects them with authentication errors. | Set up monitoring and alerts in your cloud dashboard. Track certificate expiration dates and trigger an automated rotation process 30-60 days before they expire. |
| NVS Not Encrypted | Certificate rotation works, but an attacker with physical access can read the new private key from the flash chip. | Enable NVS Encryption and Flash Encryption in menuconfig. This is critical to protect any sensitive data, especially private keys, stored in flash. |
| Private Key Sent Insecurely | A new private key is sent to the device over an unencrypted channel (e.g., plain HTTP or MQTT). | Only transmit keys over an authenticated, encrypted channel like mTLS. Better yet, use a hardware-secure method like the DS peripheral to generate keys on-device so they are never transmitted at all. |

Exercises

  1. Implement Rotation Fallback: Add the fallback mechanism described in “Common Mistakes #1” (No Rotation Fallback) to the practical example. The code should maintain a boot counter in RTC memory. If it reboots with NVS certs present and fails to connect, it should increment the counter. If the counter reaches 3, it should log an error, erase the NVS certificate keys, reset the counter, and reboot.
  2. On-device CSR Generation: (Advanced) Using the mbedtls library in ESP-IDF, write a function that does not take any input but generates a new 2048-bit RSA key pair and then creates and prints a standard Certificate Signing Request (CSR) to the console. This CSR is what a device would send to a CA to get a new certificate in a secure rotation or JITP flow.
  3. PKI Design Document: Imagine you are designing a new IoT product. Write a short (1-2 page) design document outlining your certificate management strategy. Which provisioning model will you use (pre-provisioning vs. JITP) and why? What will be the lifespan of your device certificates? How will you handle rotation and revocation?

Summary

  • Scalable IoT security is built on a Public Key Infrastructure (PKI), which provides a hierarchical chain of trust from a Root CA to each device.
  • Every certificate follows a lifecycle: secure provisioning, planned rotation, and potential revocation.
  • Just-in-Time Provisioning (JITP) is a highly scalable method where devices are provisioned with their unique, long-term identity upon their first connection to the cloud.
  • Certificate rotation is a critical process for renewing certificates before they expire, maintaining security over the device’s lifespan.
  • The Digital Signature (DS) peripheral on newer ESP32 variants enables the most secure form of certificate management covered here: the private key never leaves the device, and once it is handed to the DS hardware it can no longer be read by software.
  • A robust management strategy must include fallback mechanisms and diligent monitoring of certificate expiration.
