Chapter 90: Asynchronous Socket Programming in ESP-IDF

Chapter Objectives

After completing this chapter, students will be able to:

  • Understand the difference between blocking and non-blocking socket operations.
  • Recognize the limitations of blocking sockets in embedded applications.
  • Configure a socket for non-blocking I/O using fcntl().
  • Implement non-blocking connect(), send(), and recv() operations.
  • Understand the concept of I/O multiplexing.
  • Use the select() system call to monitor multiple socket descriptors for readability, writability, and error conditions.
  • Manage fd_set structures and related macros (FD_ZERO, FD_SET, FD_CLR, FD_ISSET).
  • Design event-driven network applications using select().
  • Handle common non-blocking I/O scenarios, including EAGAIN / EWOULDBLOCK errors.
  • Appreciate the benefits of asynchronous programming for creating responsive and scalable network applications on ESP32.

Introduction

In our previous discussions on socket programming, most examples utilized blocking socket calls. This means that when an operation like accept(), connect(), send(), or recv() is initiated, the application task pauses (blocks) until that operation completes or a timeout occurs (if one was set with SO_RCVTIMEO or SO_SNDTIMEO). While simple to program, this blocking behavior can be problematic in embedded systems like the ESP32, especially when managing multiple connections or needing to perform other tasks concurrently without resorting to a complex multi-tasking design for every connection.

Asynchronous socket programming, primarily through non-blocking sockets and I/O multiplexing, offers a more efficient way to handle network events. It allows a single task to manage multiple sockets, monitor them for activity, and react only when an operation can be performed without blocking. This leads to more responsive applications, better resource utilization, and simpler concurrency models in many cases.

This chapter will introduce you to the fundamentals of non-blocking I/O and the select() system call, a powerful tool for I/O multiplexing in ESP-IDF, enabling you to build sophisticated, event-driven network applications.

Theory

1. Blocking vs. Non-Blocking Sockets

a. Blocking Sockets (Default)

By default, sockets are created in blocking mode.

  • connect(): Blocks until a connection is established or an error occurs.
  • accept(): Blocks until an incoming connection request is received.
  • send()/sendto(): Blocks if the send buffer (e.g., TCP send window or LwIP’s internal pbufs) is full, until space becomes available.
  • recv()/recvfrom(): Blocks if no data is available in the receive buffer, until data arrives or the connection is closed.

While timeouts can be set using SO_SNDTIMEO and SO_RCVTIMEO to prevent indefinite blocking, the call still waits for that timeout period if the operation cannot complete immediately.
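
As a minimal sketch of the timeout option mentioned above, the helper below applies SO_RCVTIMEO to an existing socket. The function name set_recv_timeout is hypothetical, and the sketch assumes the stack expects a struct timeval for this option (the POSIX convention). It uses the POSIX header names, which ESP-IDF also provides; "lwip/sockets.h" could be included instead.

```c
#include <sys/socket.h>
#include <sys/time.h>

// Hypothetical helper: limit how long a blocking recv() on fd may wait.
// Returns 0 on success, -1 on failure (errno is set by setsockopt()).
static int set_recv_timeout(int fd, long seconds, long microseconds)
{
    struct timeval tv = {
        .tv_sec = seconds,
        .tv_usec = microseconds,
    };
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}
```

With such a timeout in place, a recv() that finds no data still blocks the task, but only for the configured interval, after which it returns -1 with errno set to EAGAIN or EWOULDBLOCK.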

graph TD
    subgraph "Single Task Execution Flow"
        A["Task Initiates<br>Blocking Socket Call<br>e.g., recv()"] --> B{"Call Blocks Task"};
        B --> C{"Operation Can Complete?<br>(e.g., Data Arrives)"};
        C -- Yes --> D[Call Returns,<br>Task Resumes Processing];
        C -- No --> E[Task Remains Blocked<br>Waiting for Operation<br>or Timeout];
        D --> F[Proceed with Other Code];
        E --> F;
    end

    subgraph "Impact"
       Blocked[Other Events/Tasks<br>Handled by This Task<br>Are Stalled!];
       B -.-> Blocked;
    end

    classDef primary fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6;
    classDef decision fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E;
    classDef process fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF;
    classDef endo fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46;
    classDef check fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B;

    class A primary;
    class B check;
    class C decision;
    class D endo;
    class E process;
    class F process;
    class Blocked check;

Problem: If a single task is responsible for multiple sockets or other duties, a blocking call on one socket can make the entire task unresponsive to other events or sockets. Creating a separate task for each connection can be resource-intensive on an ESP32 (memory for stacks, scheduling overhead).

b. Non-Blocking Sockets

A socket can be configured to be non-blocking. When a socket is non-blocking:

  • connect(): Initiates the connection and returns immediately, typically with -1 and errno set to EINPROGRESS. The application must then use a mechanism like select() to determine when the connection attempt succeeds or fails.
  • accept(): Returns immediately. If no pending connections are present, it returns -1 with errno set to EAGAIN or EWOULDBLOCK.
  • send()/sendto(): If the data can be queued immediately (e.g., space in send buffer), it sends some or all data and returns the number of bytes sent. If the send buffer is full, it returns -1 with errno set to EAGAIN or EWOULDBLOCK.
  • recv()/recvfrom(): If data is available, it reads some or all of it and returns the number of bytes read. If no data is available, it returns -1 with errno set to EAGAIN or EWOULDBLOCK.

EAGAIN vs. EWOULDBLOCK: Both error codes mean that the operation cannot complete right now without blocking. On most systems, including LwIP, they are defined to the same value, so they can be used interchangeably in this context. POSIX does not require them to be equal, which is why portable code checks for both.
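
The return-value rules above can be folded into one small wrapper. This is a sketch under assumptions: the names rx_status_t, is_would_block, and try_recv are hypothetical, and fd is expected to be a socket that was already switched to non-blocking mode.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical result codes for one non-blocking receive attempt.
typedef enum { RX_DATA, RX_CLOSED, RX_WOULD_BLOCK, RX_ERROR } rx_status_t;

// True when err means "nothing to do right now, try again later".
static bool is_would_block(int err)
{
    return err == EAGAIN || err == EWOULDBLOCK;
}

// Attempts one recv() and classifies the outcome.
// *out_len receives the number of bytes read (0 unless RX_DATA is returned).
static rx_status_t try_recv(int fd, void *buf, size_t cap, size_t *out_len)
{
    ssize_t n = recv(fd, buf, cap, 0);
    if (n > 0) {
        *out_len = (size_t)n;
        return RX_DATA;
    }
    *out_len = 0;
    if (n == 0) {
        return RX_CLOSED; // orderly shutdown by the peer
    }
    return is_would_block(errno) ? RX_WOULD_BLOCK : RX_ERROR;
}
```

Treating RX_WOULD_BLOCK as a normal outcome rather than an error is the key design point: the caller simply moves on and revisits the socket later.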

graph TD
    subgraph "Single Task Execution Flow"
        A["Task Initiates<br>Non-Blocking Socket Call<br>e.g., recv()"] --> B[Call Returns Immediately];
        B --> C{"Operation Possible Now?<br>(e.g., Data Available?)"};
        C -- Yes --> D[Operation Partially/Fully Completes<br>Returns Bytes Read/Sent];
        C -- No --> E[Returns -1, <br><b>errno</b> = <b>EAGAIN</b> / <b>EWOULDBLOCK</b>];
        D --> F[Task Processes Result/Data];
        E --> G[Task Can Perform Other Work<br>or Check Status Later];
        F --> G;
    end
    
    subgraph "Task State"
        Active[Task Remains Active<br>Can Handle Other Events];
        B -.-> Active;
    end

    classDef primary fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6;
    classDef decision fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E;
    classDef process fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF;
    classDef success fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46;
    classDef check fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B;

    class A primary;
    class B process;
    class C decision;
    class D success;
    class E check;
    class F process;
    class G process;
    class Active success;

Using non-blocking sockets alone can lead to busy-waiting (repeatedly calling an operation in a loop until it succeeds), which is highly inefficient as it consumes CPU cycles. Therefore, non-blocking sockets are almost always used in conjunction with an I/O multiplexing mechanism.
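
To make the contrast with busy-waiting concrete, the sketch below sends a whole buffer on a non-blocking socket and, whenever the send buffer is full, sleeps in select() (covered later in this chapter) until the socket is writable instead of retrying in a tight loop. The name send_all and the per-wait timeout parameter are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: send len bytes, waiting for writability when needed.
// Returns 0 when everything was sent, -1 on error or if one wait timed out.
static int send_all(int fd, const void *data, size_t len, int timeout_ms)
{
    const char *p = data;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n > 0) {              // partial or full progress
            p += n;
            len -= (size_t)n;
            continue;
        }
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK) {
            return -1;            // real error
        }
        // Send buffer full: block in select() rather than spinning.
        fd_set writefds;
        FD_ZERO(&writefds);
        FD_SET(fd, &writefds);
        struct timeval tv = {
            .tv_sec = timeout_ms / 1000,
            .tv_usec = (timeout_ms % 1000) * 1000,
        };
        if (select(fd + 1, NULL, &writefds, NULL, &tv) <= 0) {
            return -1;            // timeout or select() failure
        }
    }
    return 0;
}
```

A single-socket helper like this still parks the task while it waits. In a real event loop the unsent remainder would be stored and the socket added to writefds alongside all other descriptors.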

| Feature / Operation | Blocking Sockets (Default) | Non-Blocking Sockets |
| --- | --- | --- |
| connect() | Blocks task until connection is established or an error occurs (or timeout). | Initiates connection and returns immediately. Often returns -1 with errno set to EINPROGRESS. Status checked later (e.g., with select()). |
| accept() | Blocks task until an incoming connection request is received. | Returns immediately. If no pending connections, returns -1 with errno set to EAGAIN or EWOULDBLOCK. |
| send() / sendto() | Blocks task if the send buffer is full, until space becomes available (or timeout). | If buffer has space, sends some/all data and returns bytes sent. If buffer is full, returns -1 with errno set to EAGAIN or EWOULDBLOCK. |
| recv() / recvfrom() | Blocks task if no data is available in the receive buffer, until data arrives or connection is closed (or timeout). | If data is available, reads some/all and returns bytes read. If no data, returns -1 with errno set to EAGAIN or EWOULDBLOCK. |
| Task Behavior | Task pauses, cannot perform other duties while waiting for the socket operation. | Task continues running, can perform other duties. Must check operation status later or react to readiness notifications. |
| Resource Usage | Simpler to program for single operations. Managing multiple connections often requires multiple tasks, which can be resource-intensive (memory, scheduling). | Allows a single task to manage multiple sockets. More complex logic but better resource utilization for concurrent operations. |
| Responsiveness | Can lead to unresponsive application if one operation blocks for a long time. | Enables more responsive applications as the main task doesn’t get stuck on I/O. |
| Typical Use with… | Simple client/server operations where concurrency is not critical or handled by multi-tasking. Timeouts (SO_RCVTIMEO, SO_SNDTIMEO) can mitigate indefinite blocking. | I/O multiplexing mechanisms like select() or poll() to efficiently manage multiple connections or operations. |

2. Configuring Non-Blocking Mode: fcntl()

The fcntl() (file control) function is a standard POSIX call used to manipulate file descriptor properties, including setting a socket to non-blocking mode.

  • int fcntl(int fd, int cmd, ... /* arg */ );

To set a socket to non-blocking mode:

| Step | fcntl() Command (cmd) | Argument (arg) | Description |
| --- | --- | --- | --- |
| 1. Get Current Flags | F_GETFL | 0 (or omitted for some fcntl versions, but 0 is safe for POSIX) | Retrieves the current file status flags for the socket descriptor fd. Returns the flags on success, -1 on error. |
| 2. Prepare New Flags | N/A (Bitwise OR operation) | current_flags \| O_NONBLOCK | Adds the O_NONBLOCK flag to the existing flags using a bitwise OR operation. This preserves other flags. |
| 3. Set New Flags | F_SETFL | The new flags value (result from step 2). | Applies the modified flags (including O_NONBLOCK) to the socket descriptor fd. Returns 0 on success, -1 on error. |
C
#include <errno.h> // For errno
#include <fcntl.h> // For fcntl, F_GETFL, F_SETFL, O_NONBLOCK
// ...
int flags = fcntl(sock_fd, F_GETFL, 0);
if (flags == -1) {
    ESP_LOGE(TAG, "fcntl F_GETFL failed: errno %d", errno);
    // Handle error; do not call F_SETFL with an invalid flags value
} else if (fcntl(sock_fd, F_SETFL, flags | O_NONBLOCK) == -1) {
    ESP_LOGE(TAG, "fcntl F_SETFL O_NONBLOCK failed: errno %d", errno);
    // Handle error
} else {
    // sock_fd is now non-blocking
}

In ESP-IDF, the standard fcntl() from <fcntl.h> maps to LwIP’s implementation, lwip_fcntl(), for socket descriptors. You can call lwip_fcntl() directly when working very close to LwIP internals, but the standard fcntl() is generally preferred for portability.
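
The same read-modify-write pattern can switch the flag in both directions. The following sketch shows one possible pair of helpers; the names set_nonblocking_mode and is_nonblocking are hypothetical and not part of ESP-IDF or LwIP.

```c
#include <fcntl.h>
#include <stdbool.h>

// Hypothetical helper: enable or disable O_NONBLOCK while preserving
// every other status flag. Returns 0 on success, -1 on failure.
static int set_nonblocking_mode(int fd, bool enable)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) {
        return -1;
    }
    flags = enable ? (flags | O_NONBLOCK) : (flags & ~O_NONBLOCK);
    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}

// Hypothetical helper: report whether fd is currently non-blocking.
static bool is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    return flags != -1 && (flags & O_NONBLOCK) != 0;
}
```

Clearing the flag with `flags & ~O_NONBLOCK` is useful when a socket should be non-blocking only during connect() and blocking afterwards.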

3. I/O Multiplexing: The Need for select()

I/O multiplexing allows a program to monitor multiple file descriptors (including sockets) to see if any of them are “ready” for a particular I/O operation (e.g., reading, writing) without blocking. The select() system call is a traditional and widely supported mechanism for this.

Analogy: Imagine a receptionist at a hotel (your single task) managing multiple phone lines (sockets).

  • Blocking: The receptionist picks up one phone line and talks. While on that call, they cannot answer other ringing lines or do other tasks.
  • Non-blocking without multiplexing (busy-waiting): The receptionist quickly picks up and puts down each phone line, one by one, asking “Anything yet?”. This is very tiring and inefficient.
  • I/O Multiplexing (select()): The receptionist has a switchboard that lights up when a phone line is ringing (readable), or when a previously initiated outgoing call connects (writable for connect), or if a line has an error. The receptionist waits for the switchboard to indicate activity, then handles only the lines that need attention.

4. The select() System Call

The select() function examines sets of file descriptors to see if any of them are ready for reading, writing, or have pending error conditions.

  • int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
    • nfds: This argument should be set to the highest-numbered file descriptor in any of the three sets, plus 1.
    • readfds: A pointer to an fd_set structure. On input, it specifies the set of file descriptors to be checked for readability (e.g., incoming data for recv, an incoming connection for accept; on many stacks a failed non-blocking connect also marks the socket readable). On successful return, this set is modified to indicate which of these descriptors are actually readable. If you’re not interested in readability, pass NULL.
    • writefds: A pointer to an fd_set. On input, it specifies descriptors to check for writability (e.g., space in send buffer for send, completed non-blocking connect). On return, it’s modified to indicate which are writable. Pass NULL if not interested.
    • exceptfds: A pointer to an fd_set. On input, specifies descriptors to check for exceptional conditions (e.g., out-of-band data for TCP, though less commonly used for basic errors). On return, it’s modified. Pass NULL if not interested. (Note: Pending socket errors like those from a failed non-blocking connect are often reported via writefds or readfds readiness, and then SO_ERROR is checked.)
    • timeout: A pointer to a struct timeval that specifies the maximum interval select() should block waiting for a descriptor to become ready.
      • If timeout is NULL: select() blocks indefinitely until at least one descriptor is ready.
      • If timeout->tv_sec and timeout->tv_usec are both 0: select() returns immediately after checking the descriptors (polling).
      • If timeout points to a struct timeval with a non-zero value: select() blocks for up to the specified time. On return, some implementations update timeout to reflect the remaining time, so portable code re-initializes it before every call.
    • Return Value:
      • On success, select() returns the total number of file descriptors that are ready across all three sets.
      • Returns 0 if the timeout expired before any descriptors became ready.
      • Returns -1 on error, with errno set. An interruption by a signal is reported the same way, with errno set to EINTR.

| Parameter | Type | Input Description | Output Behavior (On Success) |
| --- | --- | --- | --- |
| nfds | int | The highest-numbered file descriptor in any of the three sets, plus 1. | N/A (Input only) |
| readfds | fd_set * | Pointer to an fd_set of descriptors to check for readability (e.g., incoming data, new connections; on many stacks also a failed non-blocking connect). Pass NULL if not interested. | Modified to indicate which descriptors in this set are actually readable. |
| writefds | fd_set * | Pointer to an fd_set of descriptors to check for writability (e.g., space in send buffer, non-blocking connect completion). Pass NULL if not interested. | Modified to indicate which descriptors in this set are actually writable. |
| exceptfds | fd_set * | Pointer to an fd_set of descriptors to check for exceptional conditions (e.g., out-of-band data, some errors). Pass NULL if not interested. | Modified to indicate which descriptors in this set have exceptional conditions. |
| timeout | struct timeval * | Pointer to a struct timeval specifying max blocking time. NULL: block indefinitely. Timeval with 0s: return immediately (poll). Timeval with >0: block for up to specified time. | May be updated to reflect the remaining time if the call returned before the timeout expired (behavior varies by implementation; POSIX allows it). |

fd_set and Related Macros

An fd_set (file descriptor set) is a data type used to store a collection of file descriptors. Several macros are used to manipulate these sets:

| Macro | Syntax | Description |
| --- | --- | --- |
| FD_ZERO | FD_ZERO(fd_set *set); | Initializes the file descriptor set set to be empty (clears all file descriptors from it). This must be done before adding any descriptors. |
| FD_SET | FD_SET(int fd, fd_set *set); | Adds the file descriptor fd to the set set. This indicates interest in monitoring this fd. |
| FD_CLR | FD_CLR(int fd, fd_set *set); | Removes the file descriptor fd from the set set. Used when you no longer want to monitor this fd. |
| FD_ISSET | int FD_ISSET(int fd, fd_set *set); | Returns a non-zero value (true) if fd is a member of the set set, and zero (false) otherwise. Used after select() returns to check which specific descriptors are ready. |

Important: Because select() modifies the fd_set arguments to indicate readiness, you typically need to re-initialize these sets (e.g., with FD_ZERO and FD_SET) in a loop before each call to select().
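
One way to follow this rule is to rebuild the set from the application's own list of sockets before every select() call. The sketch below assumes a hypothetical convention in which unused slots in the array hold -1. The name build_read_set is likewise an assumption.

```c
#include <stddef.h>
#include <sys/select.h>

// Hypothetical helper: fill *set with every valid descriptor in fds[]
// and return the nfds value to pass to select() (highest fd + 1).
static int build_read_set(const int *fds, size_t count, fd_set *set)
{
    int max_fd = -1;
    FD_ZERO(set);                 // always start from an empty set
    for (size_t i = 0; i < count; i++) {
        if (fds[i] < 0) {
            continue;             // unused slot (assumed convention)
        }
        FD_SET(fds[i], set);
        if (fds[i] > max_fd) {
            max_fd = fds[i];
        }
    }
    return max_fd + 1;
}
```

An alternative with the same effect is to keep a master set that is only changed when sockets are added or closed, and to copy it into a working set before each call.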

5. Event-Driven Model with select()

A common pattern for using select() is an event loop:

  1. Initialize all necessary sockets (e.g., a listening socket for a server).
  2. Set these sockets to non-blocking mode if non-blocking operations are desired after select indicates readiness (especially for connect, accept, send, recv).
  3. Loop:
    • a. Initialize fd_sets (readfds, writefds, exceptfds) using FD_ZERO.
    • b. Add all active socket descriptors to the appropriate sets using FD_SET. For a listening socket, add it to readfds to check for incoming connections. For connected client sockets, add them to readfds to check for incoming data, and potentially writefds if you have data to send and want to check for writability.
    • c. Keep track of the maximum file descriptor number (nfds).
    • d. Call select() with the prepared sets and a timeout.
    • e. If select() returns > 0:
      • i. Iterate through all monitored descriptors.
      • ii. For each descriptor, use FD_ISSET() to check if it’s in the returned readfds, writefds, or exceptfds.
      • iii. If a descriptor is ready for a specific operation:
        • Listening socket readable: Call accept(). Add the new client socket to the set of monitored descriptors for future select() calls. Set the new socket to non-blocking.
        • Client socket readable: Call recv(). Handle received data. If recv returns 0 (connection closed by peer) or < 0 (error), close this client socket and remove it from monitored sets.
        • Client socket writable: Call send() to send pending data. If a non-blocking connect was in progress and the socket is now writable, the connection attempt has likely completed (succeeded or failed). Check SO_ERROR to confirm.
        • Handle exceptional conditions if exceptfds was used.
    • f. If select() returns 0: Timeout occurred. Perform any periodic tasks.
    • g. If select() returns -1: Handle error (e.g., EINTR).
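
The loop body above can be sketched for a single connection that may have both incoming data and pending outgoing data. The conn_t structure and the poll_once function are hypothetical names used only for this illustration. A real server would repeat the same steps over all of its sockets.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical per-connection state.
typedef struct {
    int fd;
    const char *out;  // next unsent byte
    size_t out_len;   // bytes still waiting to be sent
    char in[64];      // data received in the last iteration
    size_t in_len;    // number of valid bytes in in[]
    int closed;       // set when the peer closed or recv() failed
} conn_t;

// Runs one iteration of the event loop; returns select()'s result.
static int poll_once(conn_t *c, int timeout_ms)
{
    fd_set readfds, writefds;
    FD_ZERO(&readfds);
    FD_ZERO(&writefds);
    FD_SET(c->fd, &readfds);            // always interested in input
    if (c->out_len > 0) {
        FD_SET(c->fd, &writefds);       // only when there is data to send
    }
    struct timeval tv = {
        .tv_sec = timeout_ms / 1000,
        .tv_usec = (timeout_ms % 1000) * 1000,
    };
    c->in_len = 0;
    int ready = select(c->fd + 1, &readfds, &writefds, NULL, &tv);
    if (ready <= 0) {
        return ready;                   // timeout (0) or error (-1)
    }
    if (FD_ISSET(c->fd, &writefds)) {
        ssize_t n = send(c->fd, c->out, c->out_len, 0);
        if (n > 0) {                    // keep the unsent remainder
            c->out += n;
            c->out_len -= (size_t)n;
        }
    }
    if (FD_ISSET(c->fd, &readfds)) {
        ssize_t n = recv(c->fd, c->in, sizeof(c->in), 0);
        if (n > 0) {
            c->in_len = (size_t)n;
        } else if (n == 0 || (errno != EAGAIN && errno != EWOULDBLOCK)) {
            c->closed = 1;
        }
    }
    return ready;
}
```

Adding the socket to writefds only while out_len is non-zero matters: an idle connected socket is almost always writable, so monitoring it unconditionally would make select() return immediately on every iteration.
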
graph TD
    A["Start: Initialize Sockets<br>e.g., Listening Socket(s)<br>Set Sockets to Non-Blocking"] --> B{Event Loop};
    B --> C["1- Prepare fd_sets:<br>FD_ZERO(readfds)<br>FD_ZERO(writefds)<br>FD_ZERO(exceptfds)"];
    C --> D["2- Populate fd_sets:<br>FD_SET(listen_sock, readfds)<br>For each client_sock:<br>  FD_SET(client_sock, readfds)<br>  If data_to_send:<br>    FD_SET(client_sock, writefds)"];
    D --> E["3- Determine max_fd + 1 (nfds)"];
    E --> F["4- Call select(nfds, &readfds, &writefds, &exceptfds, &timeout)"];
    F --> G{"select() returns"};

    G -- "> 0 (Sockets Ready)" --> H["5- Iterate Monitored Descriptors"];
    H --> I{"FD_ISSET(sock, readfds)?"};
    I -- Yes --> J["Handle Readable Socket:<br>- If listen_sock: accept() new connection, add to monitored set, set non-blocking<br>- If client_sock: recv() data, process, handle close/error"];
    J --> H;
    I -- No --> K{"FD_ISSET(sock, writefds)?"};
    K -- Yes --> L["Handle Writable Socket:<br>- If client_sock: send() pending data<br>- If non-blocking connect pending: check SO_ERROR, complete connection"];
    L --> H;
    K -- No --> M{"FD_ISSET(sock, exceptfds)?"};
    M -- Yes --> N[Handle Exceptional Condition];
    N --> H;
    M -- No --> H; 
    
    G -- "0 (Timeout)" --> O["Perform Periodic Tasks<br>(e.g., check shutdown flag, application timers)"];
    O --> B; 

    G -- "-1 (Error)" --> P["Handle select() Error<br>(e.g., log errno, EINTR?)"];
    P --> B; 

    J -.-> B; 
    L -.-> B; 
    N -.-> B; 
    H -. Loop Done .-> B; 

    Q["End (e.g., Shutdown Signal)"]
    B -. On Shutdown Condition .-> Q;


    classDef primary fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6;
    classDef decision fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E;
    classDef process fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF;
    classDef check fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B;
    classDef success fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46;

    class A primary;
    class B decision;
    class C,D,E,J,L,N,O,P process;
    class F check;
    class G decision;
    class H process;
    class I,K,M decision;
    class Q success;

6. Non-Blocking connect()

A non-blocking connect() is a common use case with select():

  1. Create a socket.
  2. Set it to non-blocking mode using fcntl().
  3. Call connect(). It will likely return -1 with errno = EINPROGRESS. This is not a fatal error; it means the connection is being established in the background.
  4. Add the socket descriptor to writefds (and possibly readfds or exceptfds on some systems for error reporting, though writability is the most common indicator of completion) and call select().
  5. When select() indicates the socket is writable:
    • a. The connection attempt has completed.
    • b. To determine if it succeeded or failed, call getsockopt(sockfd, SOL_SOCKET, SO_ERROR, &error_val, &len).
    • c. If error_val is 0, the connection is successful. The socket is now connected.
    • d. If error_val is non-zero, the connection failed (e.g., ECONNREFUSED). Close the socket.
  6. Once connected, you can use select() with readfds for reading and writefds for writing on this socket.
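
Step 5 is easy to get wrong, so it can help to isolate the SO_ERROR query in a small function. The sketch below is one way to do that; the name socket_pending_error is hypothetical.

```c
#include <errno.h>
#include <sys/socket.h>

// Hypothetical helper: fetch the pending error of a socket.
// Returns 0 if the connection attempt succeeded, otherwise an
// errno-style code such as ECONNREFUSED.
static int socket_pending_error(int fd)
{
    int so_error = 0;
    socklen_t len = sizeof(so_error);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len) < 0) {
        return errno; // the query itself failed
    }
    return so_error;
}
```

After select() reports the socket as writable, a result of 0 from this helper corresponds to step 5c and any other value to step 5d.
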
graph TD
    A["1- Create Socket"] --> B["2- Set Non-Blocking<br>using fcntl()"];
    B --> C["3- Call connect(sock, ...)"];
    C --> D{"connect() returns -1 AND<br>errno == EINPROGRESS?"};
    
    D -- "Yes (In Progress)" --> E["4- Add sock to <b>writefds</b><br>(Optionally readfds/exceptfds for errors)"];
    E --> F["5- Call select(nfds, ..., &writefds, ..., &timeout)"];
    F --> G{"select() indicates sock is Writable?"};
    
    G -- "Yes" --> H["6a- Connection Attempt Completed.<br>Check for actual success/failure"];
    H --> I["Call getsockopt(sock, SOL_SOCKET, SO_ERROR, &err_val, ...)"];
    I --> J{"err_val == 0?"};
    J -- "Yes" --> K["6c- Connection Successful!"];
    J -- "No (err_val != 0)" --> L["6d- Connection Failed.<br>Error: strerror(err_val)<br>Close socket."];
    
    G -- "No (e.g., Timeout or Error in select)" --> M["Handle select() Timeout/Error.<br>Possibly retry or abort."];
    
    D -- "No (Immediate Error or Success)" --> N{"connect() returned 0?"};
    N -- "Yes (Immediate Success - Rare)" --> K;
    N -- "No (Immediate Failure)" --> O["Handle connect() Error:<br>errno is the error.<br>Close socket."];

    classDef primary fill:#EDE9FE,stroke:#5B21B6,stroke-width:2px,color:#5B21B6;
    classDef decision fill:#FEF3C7,stroke:#D97706,stroke-width:1px,color:#92400E;
    classDef process fill:#DBEAFE,stroke:#2563EB,stroke-width:1px,color:#1E40AF;
    classDef success fill:#D1FAE5,stroke:#059669,stroke-width:2px,color:#065F46;
    classDef error fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B;
    classDef check fill:#FEE2E2,stroke:#DC2626,stroke-width:1px,color:#991B1B;


    class A,B,C,E,F,H,I process;
    class D,G,J,N decision;
    class K success;
    class L,M,O error;

Practical Examples

Example 1: Setting a Socket to Non-Blocking Mode

This snippet shows how to take an existing socket descriptor sock_fd and make it non-blocking.

C
#include <errno.h>        // For errno
#include <fcntl.h>
#include "lwip/sockets.h" // For socket functions if not already included
#include "esp_err.h"      // For esp_err_t, ESP_OK, ESP_FAIL
#include "esp_log.h"

static const char *TAG_NONBLOCK = "nonblock_socket";

esp_err_t make_socket_non_blocking(int sock_fd) {
    int flags = fcntl(sock_fd, F_GETFL, 0);
    if (flags == -1) {
        ESP_LOGE(TAG_NONBLOCK, "fcntl(F_GETFL) failed: errno %d", errno);
        return ESP_FAIL;
    }
    if (fcntl(sock_fd, F_SETFL, flags | O_NONBLOCK) == -1) {
        ESP_LOGE(TAG_NONBLOCK, "fcntl(F_SETFL, O_NONBLOCK) failed: errno %d", errno);
        return ESP_FAIL;
    }
    ESP_LOGI(TAG_NONBLOCK, "Socket %d set to non-blocking mode.", sock_fd);
    return ESP_OK;
}

// Usage:
// int my_socket = socket(...);
// if (my_socket >= 0) {
//     make_socket_non_blocking(my_socket);
// }

Example 2: Non-Blocking TCP Client connect() with select()

This example demonstrates a client attempting a non-blocking connection.

C
#include <string.h>
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "esp_log.h"
#include "lwip/err.h"
#include "lwip/sockets.h"
#include "lwip/sys.h"
#include <lwip/netdb.h>
#include <fcntl.h>

#define SERVER_IP   "192.168.X.X" // Replace with your server's IP
#define SERVER_PORT 8080

static const char *TAG_NB_CLIENT = "nonblock_client";

// make_socket_non_blocking() is defined in Example 1; only its prototype is repeated here.
// (An empty stub that returns ESP_OK would silently leave the socket in blocking mode.)
esp_err_t make_socket_non_blocking(int sock_fd);


void non_blocking_client_task(void *pvParameters) {
    struct sockaddr_in dest_addr = { 0 }; // Zero every field before filling in the ones we need
    dest_addr.sin_addr.s_addr = inet_addr(SERVER_IP);
    dest_addr.sin_family = AF_INET;
    dest_addr.sin_port = htons(SERVER_PORT);

    int sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (sock < 0) {
        ESP_LOGE(TAG_NB_CLIENT, "Unable to create socket: errno %d", errno);
        vTaskDelete(NULL);
        return;
    }
    ESP_LOGI(TAG_NB_CLIENT, "Socket created");

    if (make_socket_non_blocking(sock) != ESP_OK) {
        close(sock);
        vTaskDelete(NULL);
        return;
    }

    ESP_LOGI(TAG_NB_CLIENT, "Attempting non-blocking connect to %s:%d", SERVER_IP, SERVER_PORT);
    int err = connect(sock, (struct sockaddr *)&dest_addr, sizeof(dest_addr));

    if (err < 0) {
        if (errno == EINPROGRESS) {
            ESP_LOGI(TAG_NB_CLIENT, "Connection in progress...");
            fd_set writefds;
            struct timeval tv;

            FD_ZERO(&writefds);
            FD_SET(sock, &writefds);

            // Set timeout for select (e.g., 5 seconds)
            tv.tv_sec = 5;
            tv.tv_usec = 0;

            // Wait for socket to become writable (indicates connection completed or failed)
            int select_err = select(sock + 1, NULL, &writefds, NULL, &tv);

            if (select_err < 0) {
                ESP_LOGE(TAG_NB_CLIENT, "select() failed: errno %d", errno);
                close(sock);
                vTaskDelete(NULL);
                return;
            } else if (select_err == 0) {
                ESP_LOGW(TAG_NB_CLIENT, "select() timeout: Connection attempt timed out.");
                close(sock);
                vTaskDelete(NULL);
                return;
            } else { // select_err > 0
                if (FD_ISSET(sock, &writefds)) {
                    int so_error;
                    socklen_t len = sizeof(so_error);
                    if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &so_error, &len) < 0) {
                        ESP_LOGE(TAG_NB_CLIENT, "getsockopt(SO_ERROR) failed: errno %d", errno);
                        close(sock);
                        vTaskDelete(NULL);
                        return;
                    }

                    if (so_error == 0) {
                        ESP_LOGI(TAG_NB_CLIENT, "Connection established successfully!");
                        // Socket is now connected, can proceed with send/recv
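                        // For this example, just send a simple message
                        const char *msg = "Hello from non-blocking ESP32 client!";
                        if (send(sock, msg, strlen(msg), 0) < 0) {
                            // EAGAIN/EWOULDBLOCK here would mean the send buffer is full right now
                            ESP_LOGE(TAG_NB_CLIENT, "send() failed: errno %d", errno);
                        }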
                    } else {
                        ESP_LOGE(TAG_NB_CLIENT, "Connection failed: SO_ERROR is %d (%s)", so_error, strerror(so_error));
                        close(sock);
                        vTaskDelete(NULL);
                        return;
                    }
                } else {
                     ESP_LOGE(TAG_NB_CLIENT, "select() returned, but socket not in writefds set unexpectedly.");
                }
            }
        } else { // Error other than EINPROGRESS
            ESP_LOGE(TAG_NB_CLIENT, "connect() failed immediately: errno %d (%s)", errno, strerror(errno));
            close(sock);
            vTaskDelete(NULL);
            return;
        }
    } else { // connect() returned 0, meaning immediate success (rare for non-blocking)
        ESP_LOGI(TAG_NB_CLIENT, "Connection established immediately (rare for non-blocking).");
        // Socket is now connected
    }

    // ... further operations on the connected socket ...
    ESP_LOGI(TAG_NB_CLIENT, "Closing socket.");
    close(sock);
    vTaskDelete(NULL);
}

Example 3: TCP Server with select() to Handle Multiple Clients

This server listens for connections and uses select() to handle incoming data from multiple clients without creating a task per client.

C
#include <string.h>
#include <sys/param.h> // For MAX/MIN (used by LwIP fd_set if not directly available)
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "esp_log.h"
#include "lwip/err.h"
#include "lwip/sockets.h"
#include "lwip/sys.h"
#include <lwip/netdb.h>
#include <fcntl.h>

#define SERVER_PORT 8080
#define MAX_CLIENTS 5 // Max concurrent clients for this simple example
#define RCV_BUFFER_SIZE 128

static const char *TAG_SELECT_SERVER = "select_server";

// make_socket_non_blocking() is defined in Example 1; only its prototype is repeated here.
// (An empty stub that returns ESP_OK would silently leave the sockets in blocking mode.)
esp_err_t make_socket_non_blocking(int sock_fd);

void select_server_task(void *pvParameters) {
    int listen_sock;
    int client_sockets[MAX_CLIENTS];
    fd_set readfds, masterfds; // masterfds keeps track of all active sockets
    int max_sd, activity, i, new_socket, valread;
    struct sockaddr_in server_addr, client_addr;
    socklen_t client_addr_len = sizeof(client_addr);
    char buffer[RCV_BUFFER_SIZE + 1];

    for (i = 0; i < MAX_CLIENTS; i++) {
        client_sockets[i] = 0; // 0 indicates an available slot
    }

    listen_sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (listen_sock < 0) {
        ESP_LOGE(TAG_SELECT_SERVER, "Socket creation failed: errno %d", errno);
        vTaskDelete(NULL);
        return;
    }
    if (make_socket_non_blocking(listen_sock) != ESP_OK) { // Listen socket also non-blocking for accept
        close(listen_sock);
        vTaskDelete(NULL);
        return;
    }

    server_addr.sin_family = AF_INET;
    server_addr.sin_addr.s_addr = htonl(INADDR_ANY);
    server_addr.sin_port = htons(SERVER_PORT);

    if (bind(listen_sock, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0) {
        ESP_LOGE(TAG_SELECT_SERVER, "Bind failed: errno %d", errno);
        close(listen_sock);
        vTaskDelete(NULL);
        return;
    }
    ESP_LOGI(TAG_SELECT_SERVER, "Listener on port %d", SERVER_PORT);

    if (listen(listen_sock, 3) < 0) {
        ESP_LOGE(TAG_SELECT_SERVER, "Listen failed: errno %d", errno);
        close(listen_sock);
        vTaskDelete(NULL);
        return;
    }

    FD_ZERO(&masterfds);
    FD_SET(listen_sock, &masterfds);
    max_sd = listen_sock;

    ESP_LOGI(TAG_SELECT_SERVER, "Waiting for connections ...");

    while (1) {
        readfds = masterfds; // Copy master set, as select modifies it
        
        // No timeout for select, waits indefinitely for activity
        activity = select(max_sd + 1, &readfds, NULL, NULL, NULL);

        if (activity < 0) {
            if (errno != EINTR) {
                ESP_LOGE(TAG_SELECT_SERVER, "select error: errno %d", errno);
                // Potentially break or handle critical error
            }
            continue; // After a failed call, readfds does not describe readiness
        }
        if (activity == 0) {
            ESP_LOGI(TAG_SELECT_SERVER, "select timeout (should not happen with NULL timeout)");
            continue;
        }

        // Check for incoming connection on listening socket
        if (FD_ISSET(listen_sock, &readfds)) {
            client_addr_len = sizeof(client_addr); // Value-result argument: reset before every accept()
            new_socket = accept(listen_sock, (struct sockaddr *)&client_addr, &client_addr_len);
            if (new_socket < 0) {
                if (errno != EAGAIN && errno != EWOULDBLOCK) {
                    ESP_LOGE(TAG_SELECT_SERVER, "accept failed: errno %d", errno);
                }
                // No new connection ready, continue
            } else {
                ESP_LOGI(TAG_SELECT_SERVER, "New connection: socket fd is %d, ip is: %s, port: %d",
                         new_socket, inet_ntoa(client_addr.sin_addr), ntohs(client_addr.sin_port));
                make_socket_non_blocking(new_socket);

                // Add new socket to array of sockets
                for (i = 0; i < MAX_CLIENTS; i++) {
                    if (client_sockets[i] == 0) {
                        client_sockets[i] = new_socket;
                        ESP_LOGI(TAG_SELECT_SERVER, "Adding to list of sockets as %d", i);
                        FD_SET(new_socket, &masterfds); // Add to master set
                        if (new_socket > max_sd) {
                            max_sd = new_socket;
                        }
                        break;
                    }
                }
                if (i == MAX_CLIENTS) {
                     ESP_LOGW(TAG_SELECT_SERVER, "Max clients reached. Rejecting new connection %d", new_socket);
                     send(new_socket, "Server busy. Try later.\r\n", strlen("Server busy. Try later.\r\n"), 0);
                     close(new_socket);
                }
            }
        }

        // Check for I/O on other client sockets
        for (i = 0; i < MAX_CLIENTS; i++) {
            int sd = client_sockets[i];
            if (sd > 0 && FD_ISSET(sd, &readfds)) { // Check if it's an active socket and ready for reading
                valread = recv(sd, buffer, RCV_BUFFER_SIZE - 1, 0); // Leave room for the '\0' terminator added below
                if (valread == 0) { // Connection closed by client
                    // client_addr still holds the most recently accepted peer, so log only the descriptor here
                    ESP_LOGI(TAG_SELECT_SERVER, "Host disconnected: fd %d", sd);
                    close(sd);
                    client_sockets[i] = 0; // Mark as free
                    FD_CLR(sd, &masterfds); // Remove from master set
                } else if (valread < 0) {
                    if (errno != EAGAIN && errno != EWOULDBLOCK) {
                        ESP_LOGE(TAG_SELECT_SERVER, "recv error on fd %d: errno %d", sd, errno);
                        close(sd);
                        client_sockets[i] = 0;
                        FD_CLR(sd, &masterfds);
                    }
                    // If EAGAIN/EWOULDBLOCK, means no data right now, do nothing.
                } else { // Data received
                    buffer[valread] = '\0';
                    ESP_LOGI(TAG_SELECT_SERVER, "Received from fd %d: %s", sd, buffer);
                    // Echo back the message
                    if (send(sd, buffer, valread, 0) != valread) {
                         ESP_LOGE(TAG_SELECT_SERVER, "send error on fd %d: errno %d", sd, errno);
                         // Handle send error, potentially close if severe
                    }
                }
            }
        }
    }
    // Cleanup (not reached in this example's infinite loop)
    close(listen_sock);
    vTaskDelete(NULL);
}
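
The server keeps only descriptors in client_sockets, so the address of a particular peer is no longer available when that client later disconnects. One option, sketched below under stated assumptions, is to store the address returned by accept() next to each descriptor and to recompute max_sd when a client leaves. The names client_slot_t, slots_init, slots_add, and slots_remove are hypothetical and not part of the example above, and the MAX_CLIENTS value is illustrative. The sketch uses POSIX headers so it can be tried on a PC; on ESP-IDF the same types are expected to come from lwip/sockets.h.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/select.h>

#define MAX_CLIENTS 4  /* illustrative; the example defines its own value */

/* Hypothetical per-client record: descriptor plus the address from accept(). */
typedef struct {
    int fd;                   /* -1 marks a free slot */
    struct sockaddr_in peer;  /* saved at accept() time */
} client_slot_t;

void slots_init(client_slot_t *slots) {
    for (int i = 0; i < MAX_CLIENTS; i++) {
        slots[i].fd = -1;
    }
}

/* Store a new client. Returns the slot index, or -1 if the table is full. */
int slots_add(client_slot_t *slots, fd_set *master, int *max_sd,
              int fd, const struct sockaddr_in *peer) {
    assert(fd >= 0 && fd < FD_SETSIZE);
    for (int i = 0; i < MAX_CLIENTS; i++) {
        if (slots[i].fd < 0) {
            slots[i].fd = fd;
            slots[i].peer = *peer;
            FD_SET(fd, master);
            if (fd > *max_sd) {
                *max_sd = fd;
            }
            return i;
        }
    }
    return -1;
}

/* Release a slot and recompute max_sd. The caller still close()s the fd. */
void slots_remove(client_slot_t *slots, fd_set *master, int *max_sd,
                  int listen_fd, int idx) {
    FD_CLR(slots[idx].fd, master);
    slots[idx].fd = -1;
    *max_sd = listen_fd;
    for (int i = 0; i < MAX_CLIENTS; i++) {
        if (slots[i].fd > *max_sd) {
            *max_sd = slots[i].fd;
        }
    }
}
```

With this layout the disconnect branch can log slots[i].peer instead of the shared client_addr variable, and using -1 as the free marker avoids treating descriptor 0 as special.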

Build Instructions

  1. Create Project: Standard ESP-IDF project.
  2. Add Code: Place the chosen example code in main.c or a separate file.
  3. Network Setup: In app_main, initialize NVS, netif, event loop, and connect to Wi-Fi/Ethernet.
  4. Task Creation: Use xTaskCreate to start the non_blocking_client_task or select_server_task.
  5. LwIP Configuration:
    • Ensure LwIP is enabled.
    • The default CONFIG_LWIP_MAX_SOCKETS (10 in current ESP-IDF releases) is sufficient for these examples, provided MAX_CLIENTS plus the listening socket stays below it. To handle more sockets with select(), increase CONFIG_LWIP_MAX_SOCKETS in menuconfig and check the permitted range for your ESP-IDF version. Every descriptor placed in an fd_set must also be numerically smaller than FD_SETSIZE.
  6. Build: idf.py build
  7. Flash: idf.py -p (PORT) flash
  8. Monitor: idf.py -p (PORT) monitor

Run/Flash/Observe Steps

  • Non-Blocking Client (Example 2):
    1. Set up a TCP server on your PC (e.g., netcat -l -p 8080; some netcat variants use nc -l 8080 instead) at the SERVER_IP and SERVER_PORT defined in the client code.
    2. Flash and run the client on ESP32.
    3. Observe logs for “Connection in progress…”, then “Connection established successfully!” or a failure message.
    4. If successful, the server should receive “Hello from non-blocking ESP32 client!”.
  • Select Server (Example 3):
    1. Flash and run the server on ESP32.
    2. From one or more PCs, connect using netcat <ESP32_SERVER_IP> 8080 or Telnet.
    3. Send messages from clients. Observe the server log them and echo them back.
    4. The server should handle multiple clients concurrently within a single task.
    5. Test client disconnections.

Variant Notes

The fcntl() and select() APIs, as provided by LwIP through ESP-IDF, are standard and behave consistently across the ESP32, ESP32-S2, ESP32-S3, ESP32-C3, ESP32-C6, and ESP32-H2 variants.

  • LwIP Core Consistency: The underlying mechanisms for non-blocking I/O and I/O multiplexing are part of the LwIP core, ensuring uniform behavior.
  • Performance:
    • CPU load when using select() with many file descriptors can become a factor. Faster CPUs on variants like ESP32-S3 might handle larger fd_sets or more frequent select() calls with less overhead.
    • The efficiency of select() itself is generally good for the number of sockets typically managed by an ESP32.
  • FD_SETSIZE and Socket Limits:
    • The number of descriptors select() can monitor is bounded by FD_SETSIZE, and the number of sockets that can be open at once is bounded by CONFIG_LWIP_MAX_SOCKETS (default 10 in current ESP-IDF releases). To monitor more sockets, increase CONFIG_LWIP_MAX_SOCKETS via menuconfig, within the range your ESP-IDF version permits, and confirm that enough RAM remains for the additional LwIP structures.
    • RAM availability on different variants will also influence how many active socket connections (and associated LwIP protocol control blocks – PCBs) can be realistically maintained.
  • Alternatives to select(): While select() is widely available, other I/O multiplexing mechanisms like poll() and epoll() (on Linux) exist. LwIP has some support for a poll()-like API (lwip_poll), but select() is the most commonly used and well-documented for basic I/O multiplexing in ESP-IDF examples. epoll is generally not available in LwIP.
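
For comparison with select(), the following is a minimal sketch of waiting on a single descriptor with the POSIX poll() call. It assumes a poll() implementation is available in your toolchain (verify this against the headers of your ESP-IDF version before relying on it), and the helper name read_with_timeout is hypothetical. The sketch is written against POSIX headers so it can be tried on a PC first.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable, then read from it.
 * Returns the read() result, or -1 with errno == ETIMEDOUT when nothing
 * arrived in time. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms) {
    assert(buf != NULL);
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    int rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0) {
        return -1;          /* poll() itself failed; errno is already set */
    }
    if (rc == 0) {
        errno = ETIMEDOUT;  /* no event within the timeout */
        return -1;
    }
    /* Readable, hung up, or in error: read() reports which one it is. */
    return read(fd, buf, len);
}
```

Unlike select(), poll() takes an array of descriptors rather than bit sets, so there is no nfds-plus-one calculation and no need to copy a master set before each call.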

For most ESP32 applications, select() provides a robust and adequate solution for asynchronous I/O. The choice of ESP32 variant will primarily affect how many concurrent connections can be handled smoothly due to RAM and overall processing capacity, rather than differences in the select() API itself.

Common Mistakes & Troubleshooting Tips

Each entry below lists the mistake or issue, its symptom(s), and the troubleshooting step or solution.

  • Forgetting to Set Socket to Non-Blocking
    • Symptom(s): select() indicates readiness, but subsequent accept(), recv(), or send() calls still block unexpectedly. Application becomes unresponsive.
    • Fix: Explicitly set listening and connected client sockets to non-blocking mode using fcntl(fd, F_SETFL, flags | O_NONBLOCK).
  • Incorrect nfds Argument in select()
    • Symptom(s): If nfds is too low, select() ignores every descriptor numbered nfds or higher, so events on those sockets are never reported. A value that is too high does not by itself cause errors, because only descriptors present in the sets are examined. EBADF is reported when a set contains a closed or invalid descriptor.
    • Fix: nfds must be the highest socket descriptor value in any of the sets, plus one. Maintain a variable tracking the maximum sd added.
  • Not Re-initializing fd_sets Before Each select() Call
    • Symptom(s): After the first successful select() call, subsequent calls behave incorrectly, either missing events or reporting stale events, because select() modifies the sets.
    • Fix: Always re-initialize the working fd_sets (e.g., readfds = masterfds; or using FD_ZERO and FD_SET for all relevant fds) inside the loop before each call to select().
  • Mishandling EINPROGRESS for Non-Blocking connect()
    • Symptom(s): Treating EINPROGRESS (returned by a non-blocking connect()) as a fatal error and closing the socket prematurely.
    • Fix: EINPROGRESS is expected. After it occurs, use select() to monitor the socket for writability. Once writable, use getsockopt(sockfd, SOL_SOCKET, SO_ERROR, ...) to confirm whether the connection succeeded (error code 0) or failed.
  • Busy-Waiting After Non-Blocking Call Returns EAGAIN/EWOULDBLOCK
    • Symptom(s): If a non-blocking recv() or send() returns EAGAIN, the code immediately retries in a tight loop, consuming excessive CPU.
    • Fix: EAGAIN or EWOULDBLOCK means the operation would block. Do not spin. Rely on select() to notify when the socket is ready again for reading (add to readfds) or writing (add to writefds).
  • Incorrectly Handling recv() Return Value of 0
    • Symptom(s): Ignoring or misinterpreting a return value of 0 from recv() on a TCP socket, potentially leading to infinite loops or incorrect state.
    • Fix: A return value of 0 from recv() on a stream socket (TCP) indicates that the peer has gracefully closed its end of the connection. Your application should then close its socket and remove it from select() monitoring.
  • File Descriptor Exhaustion or Mismanagement
    • Symptom(s): Running out of available socket descriptors (EMFILE, ENFILE errors). select() might monitor incorrect or closed descriptors, leading to unpredictable behavior or EBADF.
    • Fix: Implement robust cleanup. Always close(sd) sockets when they are no longer needed (client disconnects, unrecoverable errors). Also, remove them from any master fd_sets using FD_CLR(sd, &masterfds).

Tip: Use logging extensively. Log errno values when socket calls fail. Step through the select() loop logic carefully to understand which sockets are being added and checked.

Exercises

  1. Non-Blocking UDP Echo Server with select():
    • Adapt the TCP server example (Example 3) to work with UDP.
    • The server should create a single UDP socket, bind it, and set it to non-blocking.
    • Use select() to monitor this UDP socket for readability.
    • When data is received using recvfrom(), log it and send an echo back to the client’s address obtained from recvfrom().
    • Since UDP is connectionless, you won’t manage client sockets in an array like the TCP example, but rather handle each datagram as it arrives on the single server socket.
  2. Timeout Handling in select():
    • Modify the select_server_task (Example 3). Instead of a NULL timeout for select(), use a struct timeval to specify a timeout (e.g., 5 seconds).
    • If select() returns 0 (timeout), print a message like “No activity for 5 seconds, server is still alive.”
    • This demonstrates how select() can be used for periodic tasks in addition to I/O events.
  3. Graceful Shutdown of select() Server:
    • Add a mechanism to gracefully shut down the select_server_task (Example 3). For instance, use a FreeRTOS event group bit or a global flag that can be set by another task or a GPIO interrupt.
    • In the main select() loop, after select() returns (or on timeout), check this flag.
    • If the shutdown flag is set, break out of the loop, close all active client sockets, close the listening socket, and then delete the task.
  4. Error Handling for Non-Blocking send():
    • In the select_server_task (Example 3), when echoing data back to the client using send(), the send operation might not send all data in one go if the client’s receive buffer or network is slow (especially if the server’s socket send buffer becomes full).
    • Modify the echo part: If send() returns a value less than the amount of data you intended to send, or if it returns -1 with errno == EAGAIN or EWOULDBLOCK:
      • You’ll need to buffer the remaining unsent data.
      • Add the client socket descriptor to the writefds set for the next select() call.
      • When select() indicates this socket is writable, attempt to send the remaining buffered data.
    • This makes the echo more robust for non-blocking sends. (This is a more advanced exercise).
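
As a starting point for Exercise 4, the sketch below shows one possible shape for the per-client buffer of unsent bytes. It is an illustration under stated assumptions, not a reference solution: pending_tx_t, pending_queue, pending_flush, and the PENDING_CAP size are all hypothetical names and values, and the code uses POSIX headers so it can be tried on a PC before being adapted to the server task.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define PENDING_CAP 256  /* illustrative capacity */

/* Hypothetical per-client buffer for bytes that send() has not accepted yet. */
typedef struct {
    char   data[PENDING_CAP];
    size_t len;  /* bytes still waiting to be sent */
} pending_tx_t;

/* Append bytes to the pending buffer. Returns 0, or -1 if they do not fit. */
int pending_queue(pending_tx_t *p, const char *buf, size_t n) {
    assert(p != NULL && buf != NULL);
    if (n > PENDING_CAP - p->len) {
        return -1;
    }
    memcpy(p->data + p->len, buf, n);
    p->len += n;
    return 0;
}

/* Try to send the pending bytes on a non-blocking socket.
 * Returns 1 if the buffer is now empty (stop watching writefds),
 *         0 if data remains (keep the fd in writefds),
 *        -1 on a send error other than EAGAIN/EWOULDBLOCK. */
int pending_flush(pending_tx_t *p, int fd) {
    while (p->len > 0) {
        ssize_t sent = send(fd, p->data, p->len, 0);
        if (sent < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                return 0;
            }
            return -1;
        }
        if (sent == 0) {
            return 0;  /* nothing accepted; wait for the next writable event */
        }
        /* Shift the unsent tail to the front of the buffer. */
        memmove(p->data, p->data + sent, p->len - (size_t)sent);
        p->len -= (size_t)sent;
    }
    return 1;
}
```

In the select() loop, a socket would be added to writefds only while its buffer is non-empty; leaving an idle socket in writefds makes select() return immediately on every iteration, since an idle socket is almost always writable.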

Summary

  • Asynchronous socket programming uses non-blocking sockets and I/O multiplexing to handle multiple network operations efficiently within a single task.
  • Sockets are set to non-blocking mode by reading the current flags with fcntl(fd, F_GETFL, 0) and then calling fcntl(fd, F_SETFL, flags | O_NONBLOCK).
  • Non-blocking operations (connect, accept, recv, send) return immediately. If they cannot complete, they return -1 with errno set to EAGAIN or EWOULDBLOCK (or EINPROGRESS for connect).
  • The select() system call is used to monitor multiple socket descriptors for readability, writability, or exceptional conditions without blocking indefinitely on a single one.
  • fd_set structures and macros (FD_ZERO, FD_SET, FD_CLR, FD_ISSET) are used to manage the sets of descriptors for select().
  • A common pattern is an event loop: prepare fd_sets, call select(), check FD_ISSET() for ready descriptors, and handle the corresponding I/O operations.
  • Non-blocking connect() requires checking SO_ERROR after select() indicates writability to confirm success.
  • This approach improves application responsiveness and resource utilization compared to blocking models or one-task-per-connection designs, especially on resource-constrained devices like ESP32.

Further Reading

  • ESP-IDF Programming Guide
  • LwIP Project Documentation
  • Books:
    • “Unix Network Programming, Vol. 1: The Sockets Networking API” by W. Richard Stevens, Bill Fenner, and Andrew M. Rudoff – Chapters on Non-blocking I/O and I/O Multiplexing (select and poll).
    • “TCP/IP Illustrated, Vol. 1: The Protocols” by W. Richard Stevens.
  • POSIX Standards:
    • select(3p) man page or POSIX specification for select.
    • fcntl(3p) man page or POSIX specification for fcntl.
